Optimal Structures for Multimedia Instruction

Joseph Goguen and Charlotte Linde 
SRI International and Structural Semantics 

1 Introduction 

This report describes the first year's work in a two-year study of optimal structures for
multimedia instruction. The project has two phases. The first phase elicits experienced 
instructors' explanations of a demonstration device, in order to obtain for analysis a significant 
range of the possible discourse structures that occur in instruction. The outcome of this phase 
is a set of variables, and a set of hypotheses about relationships among them that lead to 
effective instruction. The second phase will test these hypotheses on groups of students. 

The aim of this project is to provide experimentally validated guidelines both for the design of 
computer-based instruction generation systems, and for human instruction in a multimedia 
setting. Potential applications for this research include multimedia output capability (e.g., 
graphics output plus audio, using speech technology) for automatic instructional systems and 
for onboard fault diagnosis systems, as well as the improvement of traditional classroom 
instruction. 

Four major results have been achieved so far: (1) a framework for discussing optimal discourse 
structures and/or visual presentations in multimedia instruction, based upon the notion of a 
mapping between semiotic systems, as discussed in Section 3.4; (2) the discovery that the 
command and control speech act chain is used in "hands-on" instruction (our structure theory 
of this discourse type is given in Appendix II); (3) a rich set of experimental hypotheses, given 
in Section 6; and (4) a demonstration of the viability of a methodology combining linguistic 
analysis with experimental research. 



We would like to thank Marshall Farr and Henry Halff, of the Office of Naval Research, for helping to
conceptualize and focus this project, and our consultants Tora Bikson and James Weiner for their suggestions
and help throughout the work.

1.1 Method

This research concerns instruction, particularly mixed media explanation, involving, for
example, language, diagrams, and demonstration equipment. The approach draws from
linguistics (specifically, discourse analysis), experimental psychology, and philosophy 
(specifically, semiotics). The purpose of this subsection is to provide enough information on 
what we are doing so that the reader can follow the explanations and examples below. Most 
of this subsection concerns the Phase I experiments already completed. See Section 4.1 and 
Appendix I for some further details, especially regarding our Phase I pilot experiments and our 
plans for the rest of the project. 

1.1.1 Task 

Instructors are given a "logic box" having four lights and two switches (see Figure 1). Each 
light realizes a different logical function of the two switches. (Note that there are sixteen such 
functions, of which just four are realized in the actual box.) Instructors are also given a 
blackboard with colored chalk. Their task is to explain to students how to use the logic box; 
students are to set the switches so as to achieve some given configuration of lights. After 
several trials, we developed the following approach: Students are told that they are being 
trained to control an irrigation system producing a continuous flow of fertilized water, and 
that each light indicates whether or not a certain fertilizer is being mixed into the current 
product. Their job will be to set the switches, upon receiving a telephone call describing the 
desired mixture. 
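
The count of sixteen follows because a function of two switches is fixed by its output on each of the four switch settings, giving 2^4 possibilities. A minimal Python sketch of this enumeration is given here; the name `all_two_switch_functions` and the choice of True for "up" are assumptions made for illustration and do not come from the experiments.

```python
from itertools import product

# The four possible settings of (switch 1, switch 2).
# Treating True as "up" is an assumption made for illustration.
SETTINGS = list(product([False, True], repeat=2))


def all_two_switch_functions():
    """Return every Boolean function of two switches as a truth table.

    Each truth table is a dict mapping a switch setting to the light's state.
    """
    tables = []
    for outputs in product([False, True], repeat=len(SETTINGS)):
        tables.append(dict(zip(SETTINGS, outputs)))
    return tables


functions = all_two_switch_functions()
print(len(functions))  # 16
```

The logic box realizes four of these sixteen truth tables, one per light.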

The explanations elicited from instructors in this way are then subjected to formal linguistic 
analysis, to identify significant variables and to formulate interesting hypotheses about the 
relations between the form of the instruction and subsequent performance by students. The 
most suitable of these hypotheses will be tested in Phase II to determine which instructional 
structures have the most favorable effects on learners' performance. 

This task was chosen because it can be presented by instructors using a wide variety of media 
mixtures (e.g., spoken language and written language; charts, equations and diagrams on the
blackboard; and direct use of the demonstration device) at several different cognitive levels 
(including concrete operational and abstract Boolean algebra; see Section 4.1). Moreover, it is 
fairly easy to design and score instruments to test the effectiveness of a given instructional
technique using student comprehension as a dependent measure. Although the task that we
have chosen may seem relatively simple, in fact instructors exhibited surprising variability in
its performance; in addition, it is typical of subtasks of larger instructional tasks, and we
believe that the results obtained from its analysis will generalize to far more complex
situations.

Figure 1: The Logic Box

1.1.2 Procedures

Subjects were instructors from the Engineering Department of a local community college. 
Before the instructional session began, one of the experimenters presented the logic box 
instruction task to the instructor; the logic box itself was explained to instructors using a 
circuit diagram, with the remark that this would probably not be an appropriate explanation
for students. This left the instructor free to determine a more concrete level of description
for the students. In the first two experiments, the instructor had an audience of community
college students with little or no previous exposure to engineering subjects. (This population
is similar in background and age to most novices entering technical training programs, such as
in the Armed Forces.) The experimenters played the role of students in subsequent
experiments, since the students' questions proved to be insufficiently focussed to elicit the
desired range of responses.

Five instructors were used as subjects in six experiments. This series of experiments
increasingly refined our experimental technique. All experiments were recorded on audio tape 
and then transcribed, yielding a total of 124 pages of transcript for the instructor briefing and 
subsequent instructional session. (There are also debriefings for students and/or instructors 
for some sessions.) Each such session lasted between one-half and one hour. The first two
experiments used the same instructor, and also used groups of community college students as 
an audience, 4 students in the first and 5 in the second. 

1.1.3 Results

Analysis of this corpus resulted in the theory given in Section 4, the variables given in Section 
5, and the hypotheses given in Section 6. In addition, we discovered that our theory of the 
command and control speech act chain [Structural Semantics 83] was directly applicable to the 
structure of hands-on instruction; see Section 2.6 and Appendix II. Finally, we became 
convinced of the necessity to study the visual component (as discussed in Section 1.2) and 
were inspired to begin a systematic study of optimal representations based on semiotics (see
Section 3).

1.2 The Visual Component of Explanation

Our research to date, and indeed most research on explanation, has concentrated on the
analysis of the verbal component, since this appears to be the dominant component, 
controlling many aspects of those other modalities that may be present. However, we have 
now found that it is necessary to analyze the visual component as well as the verbal. This 
subsection discusses some reasons for this, and some probable practical results.

1.2.1 Indications of the Necessity for Study of the Visual Modality

We had initially decided not to use video in these experiments, since it is well known that
video analysis is a difficult and lengthy process. However, our pilot experiments have made it 
clear that some of the most theoretically interesting and practically important issues, such as 
the interaction between linguistic and visual modalities and the reasons why some visual 
materials are more effective than others, can only be studied using video-taped data. 

1. The Ubiquity of Diagrams. In our instructions to instructors, we indicated that if 
they wished, they could use the blackboard. All five instructors made significant use of 
both the blackboard and the demonstration device, and also employed a variety of 
referring expressions in oral explanations, such as it, this one, that one, the
bottom case (referring to the last row of a table). It is difficult or impossible to 
understand the meaning of such expressions without video. 

2. Visual Deixis. One of the most important problems in linguistic theory is reference 
and the accomplishment of reference. This problem becomes additionally complicated 
when reference can be accomplished not only with referring expressions, but also by
means of visual deixis — pointing, gaze direction, etc. In a subject domain in which a 
considerable amount of the material to be conveyed is in visual form, visual deixis 
becomes so important a form of reference that it must be studied in order to understand 
the mechanisms of communication. 

3. Anomalies in the Visual Modality. Our current research has revealed a number of 
interesting anomalies or errors in the construction of diagrams. One such anomaly is the 
case of a subject who made visual reference to parts of the diagram by pointing to its 
parts, before he drew the diagram on the board. (Our experience of lectures, classroom
explanations, etc., indicates that this practice is more common than might be believed.)
Another interesting case is that in which the diagram differs from the verbal explanation,
either because the diagram is incorrect, or because the explanation is incorrect. It
seems clear that these unfortunately fairly common types of error must have a
considerable effect on learning. In addition to this fairly obvious hypothesis, it would
also be of great interest to see whether these types of error are themselves dependent on
some other variables in the communication situation. 

1.2.2 Practical Reasons to Study the Visual Modality 

In addition to these considerations from the experiments already performed, there are also 
strong practical reasons for studying the visual modality. The first is that the relation 
between the verbal material and the visual models should yield a number of hypotheses which
would be easy to test, and which would be rather general in scope. In addition, such
hypotheses should lead to valuable suggestions for training in multimedia instruction, since it 
is known that most people, including most instructors, receive no training in the production 
and use of effective diagrams, visual models, etc. 

1.2.3 Theoretical Basis of the Study of the Visual Modality 

Linguistics provides a wealth of techniques for the analysis of spoken language, but 
unfortunately there is no comparable body of theory for the analysis of graphical or mixed 
media data. We found that, while semiotics does provide a good starting point, it lacks a 
theoretical framework for describing systematic ways of representing signs from one sign 
system with signs from another system (e.g., representing states of the logic box by rows of a 
table on a blackboard). This led us to develop the notion of a "semiotic morphism" described 
in Section 3.4. Now that this theory is available, we are able to design experiments especially
suitable for collecting video data, and in Phase II, we will perform at least two such
experiments before moving on to testing hypotheses about which representations and
structures have the most favorable effects on learners' performance. The experiments of
Phase II will involve showing instructional videotapes constructed according to specific
principles to groups of experimental subjects, and then administering a single post-instructional
instrument.

1.3 Related Research 

The present study has connections with many other projects in cognitive science, psychology, 
and artificial intelligence. Part of our analysis is to understand the conceptual structure of the 
task and the task domain. This is similar to [Stevens & Steinberg 81, Stevens & Collins 
80, Gentner & Gentner 82], which provide conceptual models for complex knowledge domains
and the possible explanations that can be given of them. Similarly, [Kieras 82, Kieras & 
Bovair 83] study the organization of knowledge schemas for electronic devices and the effects 
of different mental models on how to operate such devices. An early important study in this 
area is [Grosz 77], which established the hierarchical nature of people's knowledge of task
domain structure for problems like water pump repair. 

The present study differs from these preceding studies in emphasizing structural analysis of 
both the semantics and the syntax of interaction, particularly explanation. We find that
complex communicative situations involving multiple semiotic systems cannot be analysed
intuitively or on the basis of our knowledge as members of the culture; thus formal, 
theoretically-based analysis is required. 

Many additional references are given in the body of this report in connection with specific 
topics such as semiotics. 

2 Discourse Analysis 

This section reviews some of the concepts, theories, and techniques from linguistics that are 
used to analyze the explanations elicited, in order to provide specifiable and quantifiable data
for further research. We first discuss the basic notions of discourse unit and discourse type, 
and the kinds of rules that apply to them, and then discuss the known discourse types found 
in our data, mentioning some particularities that these discourse types exhibit in instructional 
discourse. Many of our examples are taken from the study of aircrew 
communication [Structural Semantics 83] since the complete range of examples is not yet 
available for instructional discourse at the present stage of research. 

2.1 Discourse Unit and Discourse Type 

Discourse analysis, the study of linguistic units larger than the sentence, is used in this study 
because it appears that the discourse unit, rather than the sentence, the word, etc., is the 
linguistic level of greatest significance in effective instruction. Specifically, we have found that 
reasoning, pseudonarrative, and the command and control speech act chain are the most 
relevant discourse types in our data. These are respectively discussed in Sections 2.3, 2.4 and 
2.5 below. 

A discourse unit is a segment of spoken language, composed of one or more sentences, having
socially recognized initial and final boundaries, and a formally definable internal structure. 
(This definition generalizes the criteria given by [Labov 72] for the narrative of personal 
experience.) Other discourse units that have been studied include pseudonarratives, 
specifically spatial descriptions [Linde 74, Linde & Labov 75], plans [Linde & Goguen 78], 
jokes, and explanations [Weiner 79, Goguen, Weiner & Linde 81]. It is rare that an entire 
discourse unit consists of a single sentence; more often, it appears as several sentences, a
question-answer pair, a question-answer-evaluation triple, etc. A discourse type is a theory
of the structure of a class of discourse units, that is, it provides a way of recognizing whether
or not a given segment of language is an instance of the type. Thus, we can think of a 
discourse type as the class of discourse units that satisfy a given theory. This corresponds to 
the familiar distinction between type and token. 

Discourse analysis studies the structure of discourse types. In order to apply it to the question 
of how people actually use discourse units, there are a number of further requirements on how 
the research should be conducted, and in particular, on the descriptions to be used for the 
discourse units involved. First, the work must be based upon a careful empirical analysis of
actual human discourse in natural situations. This means in particular that we cannot use 
invented examples to develop our theory (although such examples could be used to illustrate 
it). Secondly, it is necessary to have a mathematically precise description of the discourse 
structures of interest. Without this, we cannot properly test hypotheses involving variables 
that refer to discourse structure. Third, a suitable theory must also provide a simple and 
natural taxonomy of the parts that can occur in a given type of discourse, and of how these 
parts relate to one another. Each of the discourse types that has been studied has certain 
characteristic parts, and also certain characteristic relationships of subordination among these
parts. 

For example, in reasoning, one statement may be subordinate to another statement by the
relationship of providing a supporting REASON, as in the following example, where the 
second statement supports the first. 

(1a) If your memory is short, the best thing to do is
     to construct a table
(1b) so that you know exactly what what the output would
     be.

Other kinds of subordination that can occur in reasoning include a subordinate statement 
serving as an EXAMPLE (i.e., an instance) of a statement, and several statements serving in 
conjunction or disjunction, supporting the same statement. There is also ALT subordination, 
indicating that two subtrees represent alternate possible worlds. 

Such an organization of discourse units into parts that are connected by relationships of
subordination is easily and naturally represented by a tree structure. This offers a
convenient, graphically suggestive, and mathematically precise way to represent hierarchical
subordination. In this representation, the top node represents the whole discourse, and its
immediate subordinates represent the first subdivision into parts. For example, in reasoning 
the top node is a STATEMENT/REASON node indicating a division into two major parts, 
the first a statement of the proposition to be established, and the second a structure of 
propositions supporting this statement. Labels on nodes distinguish the different kinds of 
subordination that occur; these labels are called subordinators. Such a labelled tree 
structure closely resembles the tree structure of a mathematical proof of the assertion at the 
STATEMENT subtree of the root. 

A fourth feature of discourse that an adequate theory should model is the construction of 
discourse units in real time. For this purpose, it is necessary to have a notion of the present 
focus of attention, in order to be able to indicate to what previous part a new part is to be 
subordinated, as discussed in the next subsection. 
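
The labelled tree representation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Node` class and the `render` function are hypothetical names introduced here, and leaves simply carry the example labels (1a) and (1b) rather than the full utterances.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A node of a discourse tree.

    `label` is a subordinator (e.g. "STATEMENT/REASON") on an internal
    node, or an identifier for a discourse part on a leaf.
    """
    label: str
    children: List["Node"] = field(default_factory=list)


def render(node: Node, depth: int = 0) -> str:
    """Return an indented outline of the tree, one node per line."""
    lines = ["  " * depth + node.label]
    for child in node.children:
        lines.append(render(child, depth + 1))
    return "\n".join(lines)


# Example (1a-b): the second statement is a REASON supporting the first.
example_1 = Node("STATEMENT/REASON", [Node("(1a)"), Node("(1b)")])
print(render(example_1))
```

Other subordinators named in the text (EXAMPLE, ALT, conjunction, disjunction) would appear as further values of `label` on internal nodes.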

2.2 Transformation and Focus of Attention 

The real time aspect of discourse is especially important for any study of the interactive use of 
language. The process of discourse construction is modelled by transformations on the tree 
structure that represents the discourse structure. Such transformations can add, delete, or 
alter a discourse part. It is particularly important within the instructional context to have a 
formal description of such processes, since they represent important variables of instructor- 
student interaction and of instructor self-correction. 

For example, Figure 2 shows the transformation that constructs a tree representing a text of
the form S1 since S2 as in Example (1a-b) above. It begins with S1, If your memory is
short, the best thing to do is to construct a table, which is then subordinated by
a STMT/RSN node as the transformation adds the statement S2, so that you know
exactly what what the output would be, supporting S1.

Transformations are very familiar in the literature of linguistics [Chomsky 65]. However, they
are most commonly applied to the structure of sentences, rather than to larger discourse
structures. Also, such transformations have not been used to model the real time construction
of syntactic structures, but rather have been postulated as part of an abstract mechanism for
generating syntactic structures.

      STMT/RSN
       /    \
     S1      S2

Figure 2: A Simple Transformation

The focus of a discourse represents the presumed focus of attention of the participants at a 
given point in a discourse; it might be described intuitively as "where we are now." 
Graphically, we represent the current focus as a "*" at a particular node on the tree. [Grosz
77] discusses a notion of focus that is primarily semantic and is useful for resolving pronoun 
references; it involves a hierarchical structure of "focus spaces" that is similar to the use of 
embedded pointers in our theory. 

There is one very important connection between focus and transformations, a constraint on 
how discourse structure can be built up in real time: a transformation can be applied only at 
the node currently in focus. This constraint on the application of transformations corresponds 
to speaker's and hearer's expectations about what will occur next. In particular, a
transformation cannot be applied to a part of the tree developed earlier without first moving 
the pointer back to the appropriate subtree. Some transformations, in fact, only accomplish 
pointer movement, i.e., they just change the focus of attention, and thus do not add any 
semantic content to the tree. 
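
The constraint that a transformation applies only at the node in focus can be illustrated with a small Python sketch. It is a simplification under stated assumptions: it uses a single pointer, the class and method names (`Discourse`, `add_reason`, `move_focus`) are hypothetical, and the rule that focus moves to the newly added part is an assumption made here rather than a claim of the theory.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # compare nodes by identity, since parent links form cycles
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = None


class Discourse:
    """A discourse tree together with a single focus pointer."""

    def __init__(self, first_part: str):
        self.root = Node(first_part)
        self.focus = self.root  # the "*" marking "where we are now"

    def add_reason(self, reason: str) -> None:
        """Subordinate the focused part under a STMT/RSN node with a new reason.

        The transformation acts only at the node currently in focus.
        """
        stmt = self.focus
        rsn = Node(reason)
        top = Node("STMT/RSN", [stmt, rsn], parent=stmt.parent)
        if stmt.parent is None:
            self.root = top
        else:
            siblings = stmt.parent.children
            siblings[siblings.index(stmt)] = top
        stmt.parent = top
        rsn.parent = top
        self.focus = rsn  # assumption: focus moves to the part just added

    def move_focus(self, node: Node) -> None:
        """A pure pointer movement: changes focus and adds no content."""
        self.focus = node


d = Discourse("S1")
d.add_reason("S2")
print(d.root.label, [c.label for c in d.root.children])  # STMT/RSN ['S1', 'S2']
```

To extend an earlier part of the tree, a caller must first call `move_focus` on that part, mirroring the requirement that the pointer be moved back before a transformation is applied there.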



Actually, more than one pointer is needed for some transformations [Goguen, Weiner & Linde 81]. We have
found constructions in explanation much like those called "parallelism" in classical rhetoric, where there is not
only an active node of focus, but also a passive node; in these constructions, some transformations reverse the
active and passive nodes, so that addition can proceed alternately among two subtrees. Markers such as "on the
other hand" are used to switch to the other subtree. There are even cases where more than two pointers are
needed; for example, if one parallel construction is embedded within another. However, such constructions can
be quite difficult to understand, and we have not found them in instructional discourse.

This general theory of the structure of discourse types is the basis for the particular theories of
reasoning, pseudonarrative, and command and control speech act chain present in our data. 
We now turn to the first of these. 

2.3 Reasoning 

Reasoning (called explanation in some previous studies) is the most frequent discourse type 
found in instructional discourse, and has been studied previously using accounts of income tax 
decisions, career choice, and the probable effect of taking certain political decisions [Weiner 
79, Goguen, Weiner & Linde 81] as data. In the current study, we call this discourse type
reasoning and reserve the term explanation for a broader social function. This social 
function of explanation can be accomplished by many discourse types including reasoning, 
narrative, pseudonarrative, and planning. For example, a question like Why are we learning 
this? might be answered with a pseudonarrative about the mixtures of fertilizer, or with a 
reasoning structure to show that a correct approach is being taken. Either of these could 
function as an explanation. 

Figure 3 shows an analysis of a simple instance of reasoning from the domain of aircrew 
communication in which the flight engineer reports his justification to ground control of the 
decision not to recycle the landing gear after an initial attempt to bring the landing gear down 
has failed. 

The most important relationship of subordination in reasoning is indicated by the 
STATEMENT/REASON node. In the reasoning structure displayed in Figure 3, the main
STATEMENT is Don't recycle the gear. Everything that follows is a REASON 
supporting this. The ALT node represents the speaker's postulation of two alternate worlds, 
differing by whether or not the landing gear is broken. This ALT node is established by the 
underlined portion of (2). (The number in parentheses refers to the time in the flight recorder 
transcription of the United Airlines flight 173 crash near Portland, Oregon on December 28,
1978; this convention is also used in subsequent examples taken from the same flight.) 

(2) ...we're reluctant to recycle the gear for fear
    something is bent or broken. (1752:16)

The phrase for fear indicates both the uncertainty about whether the gear is bent,
and the decision to treat the alternate world in which it is bent as the one on which attention
is focussed; in fact, the world in which the gear is not bent or broken is only implicit in the
text of this example.

            STATEMENT/REASON
             /            \
     don't recycle        ALT
                        /     \
      (not bent or broken)   REASON/STATEMENT
                               /          \
                              OR       not able to
                             /  \      get it down
                          bent  broken

... we're reluctant to recycle the gear for fear something
is bent or broken, and we won't be able to get it down
(1752:16)

Figure 3: A Reasoning Tree

Figure 4 shows the node types found in reasoning, including EXAMPLE, which is not present 
in the example of Figure 3. An EXAMPLE node takes as its subordinates one or more 
examples of a statement.

Figure 4: The Subordinators Found in Reasoning



2.3.1 Summary Nodes in Reasoning 

Although we found that the theory of reasoning structure developed in [Goguen, Weiner & 
Linde 81] applies to the units of reasoning in the current dataset of instructional discourse, we 
also made one addition that is very helpful. This is a new branch type, called SUMMARY, 
that subordinates capsule descriptions or summaries; the symbol Σ, Greek sigma, is used to
label these branches. Nodes that involve this new kind of branch include Σ/STATEMENT,
STATEMENT/Σ, Σ/STATEMENT/Σ, STATEMENT/REASON/Σ, and
Σ/REASON/STATEMENT. Some hypotheses concerning the placement and structure of
summary branches are given in Section 6. (3) is a typical example of reasoning, and contains
several summaries. It is a response to a student's complaint that previously given explanations
were too simple.

(3) You can get a little more complicated. Like you
    can think about what the individual lights do.
    One of the interesting things with this one, it
    doesn't depend upon this switch at all. Like to
    see [inaudible] Switch Two. So if it's off when
    Switch Two is down, it's on when Switch Two is
    up, independent of what happens with Switch One.
    So, Switch Two controls light C. You know, let's
    see, what's another one? I think D says that the
    two switches are the same ... so if they're both
    down or both up, D is on, but, if they're
    different, D is off. I think A, if I remember
    right, A is, is th-, no B is that they're
    both down. Uh, in any other position B is off.
    And then there's a more complicated one. What
    was that? If A's, A says that it's not anything
    other than One up or Two down, then it's on.
    Then that's off. It gets a little bit more
    complicated when you explain it that way.

The reader unfamiliar with transcriptions of spoken language may find (3) incomprehensible. It is, however,
quite characteristic of spoken data. It is immediately comprehensible when heard on the tape; with practice, the
written version of such data becomes familiar and accessible.

The structure of this reasoning unit is shown in Figure 5. Notice that two nodes of this 
structure have summary branches; the top (root) node actually has two summary branches, 
one given before the body of the explanation and the other given after it. The body of this 
reasoning unit consists of an AND that subordinates four reasoning sub-trees, one for each 
light on the box. A summary is given after the first of these sub-trees. It is interesting to 
notice that the order in which the lights are considered is here not their physical order on the 
box (which would be A, B, C, D, going left to right); rather, the lights are discussed in order of
increasing psychological complexity (although not based on any firm evidence, this order
appears to be C, D, B, A; using the convention that Sn is the predicate "switch n is up," C is
just "S2", D is "S1=S2", B is "(not S1) and (not S2)," and A is "S1 or not S2."). Two
semiotic systems that are involved here are (1) the system of things that are observable about
the lights, and (2) the discourse system in which these observations are explained. This
example is suggestive for Hypothesis 6 in Section 6, that optimally comprehensible
explanations do not preserve relations (such as ordering by complexity) at the expense of basic
constructors (in this case, the physical placement of lights). Discourse elements are ordered by
the time of their production; and the lights can be ordered either by the complexity of their
logical functions, or by their physical placement. The explanation would have been more
comprehensible if the physical ordering had been used.
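
Taking the four formulas exactly as stated in the paragraph above, with Sn true when switch n is up, the behavior of the lights can be tabulated. The following Python sketch is illustrative only; the names `LIGHTS` and `light_table` are hypothetical, and the formulas are those read off from explanation (3), not an independent specification of the box.

```python
from itertools import product

# s1 and s2 are True when the corresponding switch is up (the text's convention).
LIGHTS = {
    "A": lambda s1, s2: s1 or not s2,
    "B": lambda s1, s2: (not s1) and (not s2),
    "C": lambda s1, s2: s2,
    "D": lambda s1, s2: s1 == s2,
}


def light_table():
    """Map each (S1, S2) setting to the set of lights that are on."""
    return {
        (s1, s2): {name for name, f in LIGHTS.items() if f(s1, s2)}
        for s1, s2 in product([False, True], repeat=2)
    }


for (s1, s2), on in light_table().items():
    print("S1 up" if s1 else "S1 down", "S2 up" if s2 else "S2 down", sorted(on))
```

Iterating over `LIGHTS` in the order A, B, C, D corresponds to the physical ordering, whereas the instructor in (3) proceeded in the order C, D, B, A.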

Figure 5: Structure Tree for an Instance of Reasoning

2.4 Pseudonarrative

In addition to reasoning, instructional discourse also contains a number of instances of
pseudonarrative. Pseudonarrative is a discourse type having some but not all of the 
characteristics of spontaneous oral stories, the discourse type of which is called narrative. 
Like narrative, pseudonarrative relies on the narrative presupposition, the rule of 
interpretation stating that the order of main clauses is to be taken as the order of the events 
that they describe. Also like narrative, the pseudonarrative type permits optional initial
summaries, closing evaluations, and end markers. The difference is that narrative consists of 
past tense main clauses, referring to actions understood as actually having happened, whereas 
pseudonarrative refers to hypothetical, potential, or habitual actions. (4) is a 
pseudonarrative from the data of the current study. The reader may note a lack of
redundancy in this example; however, some redundancy is provided by the presence of a
diagram on the blackboard summarizing the same material.

(4) We'll go through it again slowly. O.K. In one 

position of these two switches, where One is up and 
Two is down, all these lights are off. We take 
Switch One and move it to its other position, three 
of the lights come on. We reverse this combination 
and make them both go up, we got again three of the 
lights are on but a different three lights. And, 
if we move em till both down, again, we got three 
lights on but these two lights are changed over. 

Note the narrative structure imposed by the use of the personal pronoun we and the active 
verbs go, take, move, make, indicating actions rather than states. 

Pseudonarrative has previously been studied in the domain of apartment layout 
descriptions [Linde 74, Linde & Labov 75]. In this domain, speakers commonly use 
pseudonarrative structure to convert spatial organization to temporal organization. It was 
found that they used a temporal organization far more frequently than a spatial organization,
and made far fewer errors in the temporally organized descriptions, suggesting that the 
pseudonarrative organization is easier to produce and understand. 

In the present domain, pseudonarrative offers a considerably simpler alternative structure to 
that of reasoning. Structurally, this simplicity is reflected in the fact that pseudonarrative has 
a broad tree structure rather than a deep one; i.e., it has fewer complex subtrees. The choice 
between pseudonarrative and reasoning yields the general hypothesis that broad trees are 
more comprehensible than deep trees, since the load on memory is less [Yngve 60], and the
specific hypothesis that pseudonarrative is simpler for novices than reasoning structure, since 
the discourse organization will be more familiar. 

2.5 Command and Control Speech Act Chains

The command and control speech act chain provides the simplest way of accomplishing several
important forms of complex social action, and is important in the study of instructional
discourse, since any sequence in which an instructor requests a student to do anything and
then receives a response from the student constitutes a command and control speech act chain. 
Such sequences are thus the basic discourse type for hands-on instruction. 

We define a speech act chain to be a maximal sequence of speech acts, each of which has the
same major propositional content. (This discussion relies on [Searle 69] in its use of the terms
"speech act" and "propositional content.") One specific form of speech act chain constitutes
the command and control speech act chain, which has been studied as the basic discourse type
for aircrew discourse [Structural Semantics 83]. Appendix II gives a technical discussion of
such command and control speech act chains, including the categories of utterance, the
subordinators that are used, and the rules that govern sequencing.
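
The definition of a speech act chain as a maximal sequence with shared propositional content can be sketched as a grouping operation. In the Python below, the content labels are hypothetical stand-ins for an analyst's judgment of "major propositional content," and we assume that a chain consists of consecutive speech acts; neither assumption comes from Appendix II.

```python
from itertools import groupby

# Each speech act is (speaker, act type, content label).
# The content labels are hypothetical; assigning them is the analyst's job.
acts = [
    ("Instructor", "request for action", "ask-a-question"),
    ("Student", "acknowledgement", "ask-a-question"),
    ("Student", "request for information", "switch-positions"),
    ("Instructor", "statement", "switch-positions"),
    ("Instructor", "request for action", "move-switch-one"),
]

def speech_act_chains(acts):
    """Split a sequence into maximal runs sharing the same content label."""
    return [list(run) for _, run in groupby(acts, key=lambda a: a[2])]

chains = speech_act_chains(acts)
print([len(c) for c in chains])  # [2, 2, 1]
```

Maximality follows from the grouping: no chain can be extended by a neighbouring speech act without changing its content label.
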

Example (5) illustrates the use of the command and control speech act chain in an 
instructional context. 

(5a) Instructor: And if you had a question, now you could
                 ask a question.
(5b) Student:    Um-
(5c) Instructor: You could say what, what kinda controls
                 do I have. What can I do with
                 that box.
(5d) Student:    How come when you've got both of the switches
                 off, you have some lights on?
(5e) Instructor: Both of the switches, off?
(5f) Student:    See you've got them off.
(5g) Instructor: Okay well that isn't necessarily off, that's
                 just down. It, maybe you really wouldn't
                 wanna say up and down rather than on and
                 off, might be a better way of saying it.
                 Does that make sense? . . .
(5h) Student:    Umm-hmm. Yeah.

(GS, p. 10)

In this example, (5a) and (5c) are requests for action, (5d) and (5e) are requests for
information, (5f) is a challenge, (5g) is a statement followed by a request for information, and
(5h) is an acknowledgement (italics indicate emphasis).

The study of speech act chains in an instructional context is of general interest in 
understanding classroom discourse [Griffin & Mehan 81, Sinclair & Coulthard 75], and is of 
particular importance in the understanding of hands-on instruction. 

3 Semiotics 

This section presents our preliminary investigations into semiotics. The present project is 
concerned with optimal structures for multimedia instruction; semiotics is a natural theoretical 
framework for such an investigation, since it attempts a general theory of all sign systems, 
including language, diagram, gesture, etc. As this work is inspired by our experimental 
program rather than by philosophical analysis, it has a preliminary character, and we expect
that there will be reformulations as our experimental hypotheses are tested and refined. 

We begin with a short introduction to semiotics based on the thought of Charles Sanders
Peirce, followed by a short discussion of the relation between semiotics and linguistics that 
includes a review of Saussure, a founding figure with Peirce in the study of semiotics. We 
next formulate precise notions of "semiotic system" and "sign system." The main concept 
then follows, that of a "semiotic morphism," which is a translation from signs in one system to 
signs (i.e., representations) in another. Finally, we consider what makes some translations 
better than others. 

These ideas should be applicable to many aspects of instruction as well as to other areas of 
communication; some instances include generating appropriate explanations, determining good 
representations ("icons") for computer graphics, measuring the quality of analogies, and 
choosing good names for files in a directory. There is a very large literature that is relevant to 
problems like these; however, all studies that we know are restricted to a particular semiotic
system (such as natural language) and/or a particular semantic domain, or else lack the
precision of the theoretical model that we will present. Some recent research, however, is
fairly close to ours in spirit; in particular, the last two problems listed above have been studied
by [Gentner 83], [Gentner & Gentner 82] and [Carroll 82], respectively, who reach conclusions
compatible with ours; in particular, they emphasize the importance of structure as opposed to
content. However, their theoretical framework is less rich than ours, which includes 
hierarchical levels and constructor functions, as well as objects and relations. 

3.1 Peirce 

This subsection briefly reviews some ideas from Peirce's approach to semiotics [Peirce 65], 
since our approach is partially based upon his ideas. We try to use Peirce's original 
terminology and definitions since his work is often superior to later popularizations and
extensions. On the other hand, his exposition is difficult and his work has not yet been 
thoroughly assimilated into current philosophical thought; furthermore, the insights of modern 
linguistics and our research on multimedia instruction have suggested certain additions and 
reformulations that we do not attempt to distinguish from Peirce's original concepts. 

A basic concept in Peirce's semiotics is semiosis, an instance of signification, which is a 
situation involving the following three main components: 

1. a sign, "something which stands to somebody for something in some respect or 
capacity;" 

2. an object, that for which the sign stands; and

3. an interpretant, which is another sign, raised by the original sign in the mind of the
interpreter, which is "directly applicable to self-control," i.e., to its pragmatic use. 

Peirce's original terminology, apparently based on the Medieval trivium, was "pure 
grammar," "logic proper," and "pure rhetoric." These concerned respectively: necessary 
conditions for meaningfulness, necessary conditions for truth, and "the laws by which ... one 
sign gives birth to another, and especially one thought brings forth another" [Peirce 65].

Peirce calls the "logical interpretant" of a sign (as opposed to its "emotional" interpretant) 
the meaning of the sign. This notion suggests the modern concept of "knowledge
representation;" similarly, his "pure rhetoric" strikingly resembles the concerns of modern 
"knowledge engineering" and expert systems. The following quotation might almost be a 
modern computer scientist (rather than a "pragmaticist"^) discussing requirements for the 
knowledge representation system of a robot: 



Peirce introduced the term "pragmaticism" in the hope that it would be so awkward that no one would copy
it, as William James had his earlier term "pragmatism."

The rational meaning of every proposition [which is a well-formed complex of
signs] lies in the future. How so? The meaning [i.e., the logical interpretant] of a
proposition is itself a proposition. Indeed, it is a translation of it. But of the
myriads of forms into which a proposition may be translated, which is that one
which is to be called its very meaning? It is, according to the pragmaticist, that
form which is most directly applicable to self-control under every situation and to
every purpose.

Certainly Peirce did not have in mind the fruits of modern cognitive and computer science, 
such as semantic networks, relational databases, non-monotonic logic and rule-based systems. 
But these appear to be consistent extensions of Peirce's thought in the direction of further 
precision, applicability and effective computability; in any case, we shall ourselves move in 
this direction. We note that it is the goal-directed content or application of such structures 
that is particularly relevant here.

Some further comments on the notions of sign and object are in order. Peirce's "objects" are 
not limited to physical objects, but also include relations and properties as possible 
designations for signs. In considering computer generated instruction, we shall probably also 
want to use less traditional and more complex entities from the ontology of modern computer 
science, such as "continuations" and procedures or algorithms. 

Peirce had a good deal to say about the nature of signs, much of it very relevant to our study 
of multimedia instruction. Let us first consider his influential threefold division of signs into 
icons, symbols and indices, according to the manner in which they signify. Peirce defines an 
icon as a "sign which refers to the Object that it denotes merely by virtue of characters of its
own," "such as a lead pencil streak as representing a geometrical line." A sign x is an index
for an object y if x and y are regularly correlated, in the sense "that always or usually when
there is an x there is also a y in some more or less exactly specifiable spatiotemporal relation
to the x in question" [Alston 67]. "Such, for instance, is a piece of mould with a bullet-hole in
it as sign of a shot" [Peirce 65]. Finally, Peirce defines a symbol as a "sign which is
constituted a sign merely or mainly by the fact that it is used and understood as such." 

Peirce did not finally believe that the best interpretant of a given sign is necessarily another 
sign. The argument goes as follows: Any given sign can always be "further developed" into 
another sign that is its interpretant. This leads to an infinite sequence of signs. If at some
point the next interpretant in this sequence is the same as the previous one, then we may
regard the sequence as finite. But in some cases it is genuinely infinite, and there is no final
interpretant. Instead, Peirce (sometimes) took what he called habit, the "readiness to act in
a certain way under some circumstances and when actuated by a given motive" as "the 
veritable and final logical interpretant." Since this is not a sign, it need not be further 
interpreted. Notice that this formulation is also quite consistent with modern procedural 
approaches to knowledge representation. 

3.2 Semiotics and Linguistics 

Although semiotics is intended to be the general study of all sign systems, almost all studies of 
semiotics have taken language as the primary semiotic system. There are several reasons for 
this. First, the units of language are both familiar and easy to discern. Letters, words, and 
parts of speech have been known and analyzed as formal units at least since Aristotle, and the 
additional units added by modern linguistics are relatively well agreed upon. In contrast, 
kinesics, or body language, after more than forty years of study, still shows no agreement on 
what its units are, or whether or not the system is independent of parallel activity in the 
linguistic system. Second, of known semiotic systems, language appears to have the greatest 
number of hierarchical levels, with the greatest number of units instantiating each level, and 
also the fullest body of theory and of methods for studying the chosen domain. Therefore, 
many studies of other semiotic systems and many theoretical formulations of semiotics as a 
meta-theory take linguistics as a model. For example, see [Worth & Adair 73] and [Carroll 80]
studying the semiotic system of film, and [Barthes 57] studying cultural systems, such as the 
meaning of steak in French cuisine. 

3.2.1 Saussure and Peirce 

Most work in semiotics derives from either Saussure or Peirce. A major difference in their 
approach [Eco 79] is that Saussure gives a two category account of semiosis, involving only 
signs and meanings, while Peirce gives the three category account discussed above. Their
work also differs greatly in emphasis: as a linguist with a background in historical linguistics,
Saussure focusses on the paradigmatic relations of signs within sign systems; while Peirce, as a
logician and philosopher interested in pragmatics, considers signification, interpretation and [...]

[...] systems include, for example, the paradigm consisting of all case markings possible for a noun
of a given class, the paradigm of all relative pronouns, and the paradigm of all Boolean 
functions. In contrast, the syntagmatic plane of organization consists of the actual order of 
signs as they are used in real time, for example, the order and syntactic organization of words 
in a sentence, or of symbols in a logical proof. Clearly, both planes of organization are 
necessary to understand a sign system. 

Perhaps the most characteristic of Saussure's contributions to the study of semiotics is an
emphasis on and extension of the notion of the arbitrariness of the sign [Saussure 74].
Traditionally in linguistics, arbitrariness has meant the absence of a necessary connection
between sound and meaning. That is, the sound sequences 'arrow' or 'fleche' or 'pfeil' can all
express the same meaning; there is no necessary connection between the sound and the
meaning. However, Saussure's work shows that although the sound/meaning connection is
arbitrary (with the minor exception of onomatopoeia), the range of meaning of a given sign is
not arbitrary, but rather is constrained by the meanings of all related words. For example, 
the set of color terms in a given language forms a mutually constraining semiotic system 
within the lexicon. Therefore, it is not possible to give the meaning of an individual color 
term in isolation. Thus we cannot understand the full range of the English term 'brown' from 
just a definition and examples of brown. We must also know what it is not, and so must also 
understand red, black, yellow, and all the other terms in the same system with which brown 
contrasts. 

These notions of the arbitrariness and mutual constraint of paradigm members will become
important when we study particular multimedia semiotic systems, since they can help to
answer questions about the boundaries of a given system, their degree of arbitrariness, etc.

3.3 Semiotic Systems 

This subsection gives our notion of semiotic system, inspired by Peirce's formulation of 
semiosis; it will be seen that most of our discussion concerns the structure of complex signs. 
Our exposition is gradual, and is finally summarized by a reasonably precise definition of 
semiotic system. Some formalization of the structure of a system of related signs is needed in
order to study what makes one representation of a given sign better than another. This is
because it is necessary to consider how related (but significantly different) signs will be
represented in order to avoid confusion and ambiguity; and it is necessary to consider what
attributes of signs should be given priority in constructing their representations. (These issues 
of representation are not explicitly addressed until Section 3.4.) 

In all but the simplest sign systems, individual signs are organized into compound signs; for 
example, sentences are made of words. This is a fundamental strategy for rendering the 
complexity of non-trivial communication more manageable. One may iterate this strategy, by 
regarding complex signs at one level of analysis as individual signs at a higher level, and then 
forming compound signs from these as well, which leads to a multilevel hierarchy of sign 
structure. For example, linguistics recognizes the following levels: phonological (the sounds of 
a given language); morphological (the smallest repeated compounds of phonemes with a stable 
meaning); lexical (words); syntactic (phrases and sentences); and discourse (multisentential)
units. It is important to note that this is a "whole/part" hierarchy, in which items at each 
level are composed from components from the next lower level. Such a hierarchy is therefore 
quite different from Peirce's three-fold division of semiosis, which focusses on the meaning of 
signs at a given level. 

Sometimes there may be one or more basic levels, which are somehow most important or 
characteristic of a given semiotic system. In the case of natural language, for the last
twenty-five years, it has been supposed that the sentential level is basic, although more recent
research is inclined to regard the discourse level as at least as important. More generally, 
there may be a partial ordering relation upon the levels of a semiotic system, such that some 
levels are more basic than others. 

The whole/part hierarchical organization of complex signs requires that signs be considered
not only individually but in their context. The immediate context of a sign consists of those
other signs that surround it, in space and/or time, and that together with it form a complex
sign at the next higher level. In numerous linguistic studies, it has been found that the
context and speaker of a given sentence in a story are at least as important for determining its
meaning as are the words that comprise it. (For an extreme example, the sentence 'Yes'
could mean almost anything if given an appropriate context.) Generalizing this, we may say
that it is more useful to view meaning as being produced "top-down" than "bottom-up." (An
example of the utility of this view is found in artificial intelligence research, where contextual
cues have been found to be essential in recognizing and disambiguating signs; this has
particularly been true for speech understanding and machine vision projects.)

It is a common approach to take individual signs as the basic meaning bearing units; however, 
Peirce gave that role to propositions, which are well-formed complexes of individual signs^. 
This has the important advantage that the well-known dependence of sign meaning upon the 
context in which the sign occurs is not a strange phenomenon that needs to be explained, but 
instead follows directly from the way that things are defined, since meaning lies in the context 
rather than in the individual sign. (Notice that this phenomenon can be iterated over several 
hierarchical levels of sign structure.) 

An important aspect of semiosis is the particular sensory means by which a sign is expressed. 
Possible senses include visual, auditory, and kinesic. For convenience, we will also include 
mental events as sensory events; some such move is clearly needed to handle many important 
examples, such as inferring a proposition from one or more others. Of course, a sign may 
involve more than one sensory modality. For example, a telephone conversation involves the 
auditory modality, while a television program involves an audio-visual mix. 

For a given sensory mix, a very large number of different signs may be possible; and it may
also be possible to organize these signs (or subsets of them) in a wide variety of different ways. 
Within a given sensory mix, a given choice of signs and way of organizing them may 
characterize a particular semiotic system (there are, of course, other factors, such as the 
objects and interpretants involved). Notice that a sign that is meaningful in one semiotic 
system may not be in another. For example, different alphabets (such as Roman, Greek and 
Cyrillic) involve different letters, although a given form for a letter may be used in more than 
one alphabet. 

3.3.1 A Formalization of Semiotic Systems 

Having considered the three aspects of semiosis, and the whole/part hierarchy of levels, we 
now consider the structure of entities at a given level. Some of this discussion may be familiar 
from formalisms used in linguistics, but our purpose here is to give a formalism that is 
applicable to any semiotic system whatever. 



Of course, a proposition is also a sign; and in some cases, an individual sign is also a proposition. 

We have already noted that entities at level n are constructed from entities at level n-1 (and
other entities at level n that are already so constructed). A given semiotic system admits only 
a certain limited number of ways to put parts together at a given level n; we will refer to 
these as its constructors at level n. In general, there is a classification scheme for the entities 
at a given level (e.g., the parts of speech are such a scheme for the syntactic level of a natural 
language semiotic system), and the constructors at that level can be seen as rules for 
combining entities of these various classes to get a new entity of another certain class. Such 
rules may be written in the form

    r: <c1>...<cn> -> <c> ,

where <c> is the result class, r is the constructor, and <c1>,...,<cn> are the classes of the
parts that r puts together. Thus, e=r(e1,...,en) is the entity (of class <c>) resulting from
applying r to entities e1,...,en of classes <c1>,...,<cn>, respectively. (Incidentally, the
<ci> may be classes of entities either from level n or level n-1.)
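
A constructor of this kind behaves like a typed function. The following Python is a minimal sketch under our own assumptions: the names Entity and Constructor, and the toy classes "Det", "N" and "NP", are hypothetical and serve only to illustrate the rule format.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Entity:
    cls: str           # class of the entity, e.g. "NP"
    constructor: str   # name of the constructor that built it ("" if basic)
    parts: tuple = ()  # the entities it was built from

@dataclass(frozen=True)
class Constructor:
    name: str
    arg_classes: tuple  # <c1>,...,<cn>
    result_class: str   # <c>

    def __call__(self, *parts):
        # The rule applies only to parts of the declared classes, in order.
        got = tuple(p.cls for p in parts)
        if got != self.arg_classes:
            raise TypeError(f"{self.name} expects {self.arg_classes}, got {got}")
        return Entity(self.result_class, self.name, parts)

det = Entity("Det", "")
noun = Entity("N", "")
np_rule = Constructor("np", ("Det", "N"), "NP")  # np: <Det><N> -> <NP>

e = np_rule(det, noun)  # e = r(e1, e2)
print(e.cls)  # NP
```

Applying np_rule to parts of the wrong classes, or in the wrong order, raises an error, which reflects the idea that only a limited number of ways of putting parts together are admitted.
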

A familiar special case is that of a formal context-free grammar, where each <ci> is a
syntactic class (or "part of speech"), the entities are words, and each rule r is of the form

    e = w0 e1 w1 e2 w2 ... en wn ,

where w0,...,wn are fixed words (or strings of words, possibly the empty string); this is more
conventionally written in the form

    <c> -> w0 <c1> w1 <c2> w2 ... <cn> wn .

A familiar special case of such a rule is

    S -> NP VP

for which <c>=S, n=2, w0=w1=w2= the empty string, <c1>=NP and <c2>=VP.
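
The string case can be sketched as a constructor that interleaves fixed words with its arguments. This Python is an illustration under our own assumptions: the helper name string_constructor and the rule "PP -> on NP" are hypothetical, and strings are joined with single spaces for readability.

```python
def string_constructor(fixed_words):
    """Build r with r(e1,...,en) = w0 e1 w1 ... en wn, for fixed_words = [w0,...,wn]."""
    def r(*parts):
        if len(parts) != len(fixed_words) - 1:
            raise ValueError("expected n parts for n+1 fixed words")
        pieces = [fixed_words[0]]
        for part, w in zip(parts, fixed_words[1:]):
            pieces += [part, w]
        return " ".join(p for p in pieces if p)  # empty strings vanish
    return r

s_rule = string_constructor(["", "", ""])  # S -> NP VP (all wi empty)
pp_rule = string_constructor(["on", ""])   # hypothetical rule PP -> on NP

print(s_rule("the light", "comes on"))  # the light comes on
print(pp_rule("the left"))              # on the left
```
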

However, a grammatical formalism that is based on strings cannot be used conveniently for
applications like two-dimensional graphical displays, or multimedia presentations such as
audio-visual animations. This is because string formalisms are not only inherently
one-dimensional^ but are also inherently limited to discrete phenomena, as opposed to phenomena
that are more naturally viewed as involving continuous variables. Some examples of
continuous variables would be pitch and volume in an auditory semiotic system, or size and
placement in a graphics system. This is why we have chosen the more general functional
notation e=r(e1,...,en).

Although two or more dimensions can in principle be encoded in such a formalism, it is unnatural and
inconvenient to do so, since no special formal support is provided for this.

Three slight additions to this basic formalism seem useful; there may well be others that we
have not discovered. First, a given constructor (or rule) r may have, in addition to its formal
arguments e1,...,en which are entities, some number of parameters p1,...,pk, chosen from fixed
sets of parameter values <p1>,...,<pk>. Thus, we might write

    r: <p1>,...,<pk>,<c1>,...,<cn> -> <c>

and

    e = r(p1,...,pk,e1,...,en) .

For an example of parameters, consider the location of the upper-lefthand corner, and the size 
of a graphic entity, say a cat, to be displayed on a graphics terminal; depending on the values 
of these parameters, the cat will have a different location and size, but will still be 
recognizably the same cat. 
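
The role of parameters can be sketched as follows. This Python is illustrative only: the names Placed and place, and the choice of a single scale factor for size, are our assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Placed:
    shape: str     # which entity is drawn
    corner: tuple  # (x, y) of the upper-left corner
    size: float    # scale factor

def place(corner, size, shape):
    """r(p1, p2, e1): the two parameters come first, then the entity argument."""
    if size <= 0:
        raise ValueError("size must be positive")
    return Placed(shape, corner, size)

a = place((0, 0), 1.0, "cat")
b = place((40, 10), 2.5, "cat")
print(a.shape == b.shape, a == b)  # True False
```

The two results differ as displayed signs but share the same entity argument, which is the sense in which the cat remains recognizably the same.
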

The second addition is that there may be a priority ordering on these constructor functions. 
Under such an ordering, there may be a primary constructor, which has greater priority 
than any other constructor; there may also be one or more secondary constructors, each 
having less priority than the primary constructor and greater priority than any non-primary 
and non-secondary constructor, with none of these having priority over any other; similarly, 
there may be one or more tertiary constructors, etc. Notice that what we have here is a 
partial ordering rather than a total ordering, since given two distinct constructors r1 and r2,
it is not necessary that either one has priority over the other. 
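
One simple way to realize such a layered priority ordering is to assign each constructor a rank. In the Python sketch below, only AND is taken from our discussion of explanation; the other constructor names and all the rank values are hypothetical.

```python
# Hypothetical ranks: 1 = primary, 2 = secondary, 3 = tertiary.
rank = {"AND": 1, "BECAUSE": 2, "IF": 2, "ASIDE": 3}

def has_priority(r1, r2):
    """True when r1 strictly outranks r2; constructors of equal rank are incomparable."""
    return rank[r1] < rank[r2]

print(has_priority("AND", "IF"))      # True
print(has_priority("BECAUSE", "IF"))  # False
print(has_priority("IF", "BECAUSE"))  # False
```

Ranking gives only one special kind of partial ordering, in which incomparable constructors always share a level; a more general partial ordering would need an explicit relation on pairs of constructors.
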

The level of discourse types in the English natural language semiotic system provides some 
nice examples of primary constructors. For example, explanation [Goguen, Weiner & Linde 
81] has a primary constructor, AND, which serves to conjoin a number of reasons for the same 
statement. The argument that AND is a primary constructor is simply that it is so basic to 
the explanation discourse type that explicit textual markers for it can often be omitted 
without obscuring the meaning. Several other discourse types are also known to have such
"default" constructors [Linde & Goguen 80]; thus, these also have primary constructors.

The third addition is that of context conditions on rules. These are conditions (i.e.,
predicates) that limit the applicability of a rule to certain particular contexts, namely those
where the predicate is true; these predicates may involve the arguments e1,...,en and also the
parameters p1,...,pk. Context conditions often express constraints that arise as a result of
structure at higher levels of the hierarchy, and are a feature of many recent grammatical 
formalisms [Kaplan & Bresnan 82], [Wasow et al. 82]. 
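
A context condition can be sketched as a predicate that guards a constructor. The Python below is a minimal illustration under our own assumptions: the helper with_condition, the caption rule and the width parameter are all hypothetical.

```python
def with_condition(rule, condition):
    """Wrap a constructor so that it applies only where the predicate is true."""
    def guarded(params, *parts):
        if not condition(params, *parts):
            raise ValueError("context condition not satisfied")
        return rule(params, *parts)
    return guarded

# Hypothetical rule: attach a caption to a figure.
def caption(params, figure, text):
    return {"figure": figure, "text": text, "width": params["width"]}

# Hypothetical condition: the text must fit within the width parameter.
def fits(params, figure, text):
    return len(text) <= params["width"]

safe_caption = with_condition(caption, fits)
print(safe_caption({"width": 20}, "fig6", "Parse tree")["text"])  # Parse tree
```

Note that the predicate sees both the parameters and the entity arguments, as the definition of context conditions allows.
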

The predicates that can be used for expressing constraints are also a significant part of a 
semiotic system. At a given level, there will be only a finite number of basic predicates; others 
can be formed as simple logical combinations of these. (Here we will rely on the conventions 
of ordinary logic, instead of creating a special grammar of predicates for a given semiotic 
system; but note that higher order logic, and other extensions of first order logic, may be 
needed.) For example, in a semiotic system for graphics, we may have predicates like RED 
and SQUARE. Note that predicates can also be constructed using functions like COLOR,
SIZE and BRIGHTNESS. These associated predicates and functions express basic properties 
of entities at a given level of the semiotic hierarchy. Like constructor functions, their 
arguments may be restricted to particular classes of entities at their level. We will use the
notation

    p: <c1>...<cn>

to indicate that p is a predicate having n arguments, where the i-th argument must lie in class
<ci>. p(e1,...,en) is thus either true, false, or undefined, and is well-formed only if ei is of
class <ci> for i=1,...,n. Functions may also have their arguments restricted to particular
classes in the same way.

It should be noted that there is a standard way to reduce functions to relations. For example,
we may represent a real-valued function f(e1,e2) as a relation

    F: <c1><c2><real>

where <c1>, <c2> refer to the classes of e1, e2 respectively. Then f(e1,e2)=r if and only if
F(e1,e2,r)=true. We can also consider arguments and/or values that are not necessarily
entities; non-entity arguments correspond to parameters, and may be restricted to specific
parameter sets. For example, g(p1,p2,e1,e2) corresponds to the relation

    G: <p1><p2><c1><c2><real>

where <p1>, <p2> refer to the parameter sets of p1, p2, and <c1>, <c2> refer to the
classes of e1, e2 respectively. Then g(p1,p2,e1,e2)=r if and only if G(p1,p2,e1,e2,r)=true.
Also note that higher order relations can play a significant role in some cases; in fact, the
priority ordering on constructors is such a higher order relation.

7 This is a limited and technical sense of "context." A broader sense has been given earlier; a still broader
sense would take account of pragmatics.
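
The reduction of functions to relations described above can be sketched directly. In this Python illustration the helper relation_of and the distance function are hypothetical, and the use of math.isclose to compare real values is our own choice for floating-point arithmetic.

```python
import math

def relation_of(f):
    """Return F with F(e1,...,en,r) true if and only if f(e1,...,en) = r."""
    def F(*args):
        *inputs, r = args
        return math.isclose(f(*inputs), r)
    return F

# Hypothetical function: horizontal distance between two placed entities.
def distance(e1, e2):
    return abs(e1["x"] - e2["x"])

D = relation_of(distance)
a, b = {"x": 1.0}, {"x": 4.0}
print(D(a, b, 3.0), D(a, b, 2.0))  # True False
```

Parameters need no special treatment here: they are simply further leading arguments of the function and of the corresponding relation.
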

The systematic construction of new entities from parts, either at the same level or at the next 
lower level, provides another kind of hierarchical structure, that of entities at a given level. 
From another perspective, one may speak of analyzing a given entity in terms of other entities 
at the same or lower levels. A familiar example from high school English is diagramming the 
syntactic structure of a sentence. Such a diagram shows the division of a sentence into parts, 
called "phrases," and moreover explicitly shows the relationships of subordination among 
these phrases; that is, it shows which phrases are sub-phrases of other phrases, and also what 
class they belong to. (Examples of classes are "noun," "verb phrase," "prepositional phrase," 
etc.) 

There are many different systems of notation for the analysis of sentence structure and for
similar structured entities in other semiotic systems, most of which rely upon special 
conventions suited to the case at hand. Our intention here is to give a uniform tree notation 
that applies to any semiotic system having constructor functions as described above. If
e=r(e1,...,en), then we shall represent e as a tree with root node labelled r, where r has n
branches corresponding to e1,...,en, each of which is either a subtree (constructed in the same
way recursively) or else is an entity of the next lower level. In some cases, it may also be
helpful to label edges with the classes to which they correspond, e.g., the edge
from the root to ei might be labelled <ci>. For example, the sentence "The light on the left
comes on" can be diagrammed as in Figure 6. Such diagrams can also be decorated with
parameters as subscripts to node names (or in parentheses after node names). We note again
that the purpose of such diagrams is to show the internal part/subpart structure of a given
entity at a given level within a given semiotic system. It is also worth noting that such an
abstract "parse tree" of a sign gives an ordering to the entities which comprise it, namely the
left-to-right ordering of the "frontier" (i.e., the leaf nodes) of the tree. We call this ordering
the intrinsic ordering of these parts.

Figure 6: Part/Subpart Hierarchy for a Compound Sign
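
The tree notation and its intrinsic ordering can be sketched in a few lines of Python. The encoding and the function name frontier are our assumptions, and the bracketing of the example sentence is a hypothetical analysis rather than a transcription of Figure 6.

```python
# A tree is (label, [children]); a leaf is a plain string, standing for an
# entity of the next lower level. The bracketing is a hypothetical analysis.
sentence = ("S", [
    ("NP", [
        ("NP", ["The", "light"]),
        ("PP", ["on", ("NP", ["the", "left"])]),
    ]),
    ("VP", ["comes", "on"]),
])

def frontier(tree):
    """Leaves in left-to-right order: the intrinsic ordering of the parts."""
    if isinstance(tree, str):
        return [tree]
    _, children = tree
    return [leaf for child in children for leaf in frontier(child)]

print(" ".join(frontier(sentence)))  # The light on the left comes on
```
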



Many aspects of this approach to the structure of signs seem to be present or implicit in
Peirce's treatment of the semiotics of propositions. But, as far as we can tell, these
considerations were never explicitly assembled into a single definition. Our purpose in doing
so below is to make as precise and explicit as possible what is involved in constructing (or in
attempting to construct) optimal explanations or other representations of instructional
material. (This application is considered in more detail in Section 3.4.)

The basic insight underlying this definition of semiotic system is that semiotic events are not
isolated phenomena, but rather occur in systems: there are common rules relating to the
recognition, construction, denotation and interpretation of such a collection of signs.
Moreover, semiotics as a subject is (or should be) more concerned with such rules than with
the comparative study of individual signs and the settings in which they are found. (This
distinction is like the distinction between descriptive biology and modern biology that is based
on biochemistry and molecular biology.) We repeat that Definition 1 merely embodies our
current understanding of the structural elements that are involved in semiotics and can be
expected to change as that understanding improves.

Definition 1: A semiotic system consists of four classes of entities: 

1. Signs, 

2. Objects, and 

3. Interpretants, 

such that each class of entities (except the first) is divided into levels (not necessarily 
disjoint), some of which may be more basic than others, such that entities at level n+1 are 
constructed from entities at level n (and other entities at level n+1) by use of a fixed set of 
constructor functions (which may also have parameters and context conditions). In 
addition, there may be a priority (partial) ordering on these functions at each level. 
Finally, there may be predicates, relations and functions expressing properties of entities at 
each level of each stage. [] 

We may illustrate the concepts in this definition with examples from the semiotic system of 
spoken English. The underlying medium is sound, that is, physical vibrations. The signs are 
classed into the usual levels of spoken English: phonemes, morphemes, words, phrases, 
sentences, and discourse units. Constructors at the sentence level are given by rules, as 
previously described. Objects and interpretants are more problematic; it seems fair to say 
that it is the objective of current research in Cognitive Science and Artificial Intelligence to 
construct suitable entities for these classes, and to write programs for processing them. 

The purpose of this definition is to explicitly describe the structure of a system of related 
signs, in order to facilitate the construction of good representations of signs from one system 
by signs from another system. The next subsection addresses such representations. 

3.4 Semiotic Morphisms 

This subsection focusses on our primary concern in semiotics, which is the translation of signs 
in one system to signs in another system. It is our intention to provide the theoretical 
background for a general theory of the construction and interpretation of signs. For example, 
generating an optimal (or at least reasonably good) explanation, generating appropriate 
graphical icons, choosing a good file name, choosing a good analogy, and understanding texts 
and/or graphics, can all be seen as problems of translating signs from one sign system into 
another. Notice that the problem of choosing an optimal mix of media also falls in this 
framework, since we can regard the signs of each media mixture as forming a subsystem of the 
total sign system within which we must choose representations. This subsection addresses 
general questions about the nature of translations between sign systems, and the reasons for 
preferring one translation to another. In order to formulate such questions with sufficient 
generality, we first introduce another basic concept, that of a sign system. 

Definition 2: A sign system is a class of entities, called signs, divided into a set of levels 
(numbered 1 to N and not necessarily disjoint), some of which may be more basic than others, 
such that entities at level i+1 (for 1 ≤ i < N) are constructed from entities at level i (and other 
entities at level i+1) by use of a fixed set of constructor functions (these may also have 
parameters and context conditions). In addition, there may be a priority (partial) ordering 
on these constructors at each level. Finally, for each level, there may be predicates, relations 
and functions expressing properties of signs at that level. [] 

Artificial systems often exhibit the structures in this definition in a very natural way. For 
example, let us consider a simple line-oriented editor for a standard 24 line by 80 character 
screen. The lowest hierarchical level is that of characters, the second that of lines, the third 
that of screenfuls, and the last that of sequences of screens; thus, entities at the second level 
consist of strings of 80 or fewer characters, and entities at the third level consist of strings of 
24 or fewer lines. We can see this simple system as having just one constructor at each level 
greater than 1, namely string-of(a1, ..., aN, N) with parameter N, which "strings together" N 
entities at the next lower level. The second level has the context condition 0 < N ≤ 80, and the 
third 0 < N ≤ 24. Since there is at most one constructor at each level, the priority ordering is 
trivial. However, there are some interesting predicates and functions, such as the 
LINELENGTH function for lines, and the ALPHANUMERIC predicate for characters. 
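The line-oriented editor example can be made concrete with a short sketch. The code below is only an illustration under stated assumptions: the names Sign, char, string_of, line, and screen are hypothetical, only the first three levels are modeled, and the context conditions are read as "between 1 and 80 characters per line" and "between 1 and 24 lines per screenful".

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

MAX_CHARS_PER_LINE = 80    # assumed reading of the level-2 context condition
MAX_LINES_PER_SCREEN = 24  # assumed reading of the level-3 context condition


@dataclass(frozen=True)
class Sign:
    level: int    # 1 = character, 2 = line, 3 = screenful
    parts: Tuple  # sub-signs, or a one-character string at level 1


def char(c: str) -> Sign:
    """Build a level-1 sign from a single character."""
    if len(c) != 1:
        raise ValueError("a level-1 sign is a single character")
    return Sign(1, (c,))


def string_of(level: int, parts: Sequence[Sign], limit: int) -> Sign:
    """Constructor string-of: strings together N signs from the next lower level."""
    n = len(parts)
    if not 0 < n <= limit:
        raise ValueError(f"context condition violated: 0 < {n} <= {limit}")
    if any(p.level != level - 1 for p in parts):
        raise ValueError("all parts must come from the next lower level")
    return Sign(level, tuple(parts))


def line(text: str) -> Sign:
    return string_of(2, [char(c) for c in text], MAX_CHARS_PER_LINE)


def screen(lines: Sequence[Sign]) -> Sign:
    return string_of(3, lines, MAX_LINES_PER_SCREEN)


def linelength(s: Sign) -> int:
    """The LINELENGTH function, defined on level-2 signs only."""
    if s.level != 2:
        raise ValueError("LINELENGTH applies to lines")
    return len(s.parts)


def alphanumeric(s: Sign) -> bool:
    """The ALPHANUMERIC predicate, defined on level-1 signs only."""
    if s.level != 1:
        raise ValueError("ALPHANUMERIC applies to characters")
    return s.parts[0].isalnum()
```

For instance, line("switch 1") builds a level-2 sign whose linelength is 8, while an 81-character string is rejected by the context condition.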

A more sophisticated editor, specifically oriented toward text editing, might have character, 
word, sentence, and paragraph among its levels. It might have one sentence level constructor 
for each possible final punctuation, e.g., SENT.(a1, ..., aN, N), SENT?(a1, ..., aN, N), 
SENT!(a1, ..., aN, N), each with parameter N and context condition N > 0. Here SENT. 
clearly has priority over SENT? and SENT!. (Note that many editors in current use provide 
both line-oriented and sentence-oriented commands.) 

We can also illustrate Definition 2 in the domain of computer graphics. Example entities are 
lines, characters, circles and squares. Levels might consist of pixels (individual "dots" on the 
screen), lines, simple figures, and windows (consisting of arbitrary "scenes," collections of 
entities at lower levels, plus other windows); each entity at each level must also have 
associated attributes for location on the screen, and size; there may also be attributes for color 
and intensity. The most interesting constructor here is WINDOW, which can encapsulate any 
collection of entities from any levels. 

The classes of signs, objects and interpretants involved in a semiotic system each form a sign 
system, as follows directly by comparing the above definition with that of semiotic system. 



We now come to the main concept of this section, which is intended to capture the notion of mapping 
signs from one system to representations as signs in another system. Such a mapping may or 
[...] 

Definition 3: Let S1 and S2 be sign systems. Then a (semiotic) morphism M: S1 → S2, 
[...] 

5. Property Predicates and Functions of S1 at Level i → Property Predicates and 
Functions of S2 at Level M(i), 

such that 

1. if i<j (for 1 ≤ i,j ≤ N, where N is the number of levels of S1) then M(i)<M(j), 

2. if r: <c1>...<cn> → <c> is a constructor at level i of S1, then M(r): 
M(<c1>)...M(<cn>) → M(<c>) is a constructor at level M(i) of S2 (if it is defined), 
and 

3. if p: <c1>...<cn> is a predicate at level i of S1, then M(p): M(<c1>)...M(<cn>) is 
a predicate at level M(i) of S2 (if it is defined)*. 

We will say that M preserves entity e at level i (for 1 ≤ i ≤ N, where N is the number of 
levels of S1) if M(i) is defined and M(e) is at level M(i) in S2. Then M preserves level i if M 
preserves all entities at level i (for which it is defined), and M is level preserving if it 
preserves all levels of S1 (for which it is defined). 

We say that M preserves constructor r: <c1>...<cn> → <c> (at level i) for entities 
e1,...,en if r(e1,...,en) is defined, if M(r)(M(e1),...,M(en)) is defined, and if it equals 
M(r(e1,...,en)). Then M preserves constructor r if it preserves r at all entities for which r is 
defined; and M preserves constructors (at level i) if it preserves all constructors (at level i, 
for which it is defined) of S1. 

If r and r' are constructor functions of S1 and r>r' (r has priority over r') in S1, then we say that M 
preserves the priority of r over r' if M(r)>M(r') in S2, provided that M(r) and M(r') are 
defined. M is priority preserving if it preserves all priorities in S1 (for entities where it is 
defined). 

Next, we say that M preserves a property relation p: <c1>...<cn> of S1 provided that 
M(p) is defined and M(p)(M(e1),...,M(en)) holds whenever p(e1,...,en) holds, for ei of class 
<ci> in S1**. Also, M is property preserving if it preserves all properties of S1 (for which 
it is defined). 

Finally, we say that M is structure preserving if it is level preserving, constructor 
preserving, priority preserving, and property preserving. [] 

* As in the previous subsection, by translating functions to relations, this condition applies to property 
functions as well, and a slight generalization also permits translating arguments and/or values that are not 
entities. 
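The constructor-preservation condition in Definition 3 can be checked mechanically for a toy case. The sketch below is an illustration only, loosely modeled on the correspondence between the physical order of lights and narrative order; the functions row, m_entity, narrate, and narrate_reversed are hypothetical, and partial functions are represented by returning None where a value is undefined.

```python
from typing import Any, Callable, Optional, Sequence


def preserves_constructor(
    m_entity: Callable[[Any], Optional[Any]],
    r: Callable[..., Optional[Any]],
    m_r: Callable[..., Optional[Any]],
    args: Sequence[Any],
) -> bool:
    """True if r(e1..en) is defined, M(r)(M(e1)..M(en)) is defined,
    and the latter equals M(r(e1..en))."""
    source = r(*args)
    if source is None:
        return False
    image_args = [m_entity(e) for e in args]
    if any(a is None for a in image_args):
        return False
    image = m_r(*image_args)
    return image is not None and image == m_entity(source)


def row(*lights: str) -> tuple:
    """Source constructor: lights laid out left to right on the box."""
    return tuple(lights)


def m_entity(e: Any) -> Optional[str]:
    """Hypothetical morphism on entities: lights and rows become text."""
    if isinstance(e, str):
        return f"light {e}"
    if isinstance(e, tuple):
        return ", then ".join(f"light {x}" for x in e)
    return None  # undefined elsewhere: morphisms may be partial


def narrate(*clauses: str) -> str:
    """Target constructor: clauses given in the same order."""
    return ", then ".join(clauses)


def narrate_reversed(*clauses: str) -> str:
    """A target constructor that scrambles the order."""
    return ", then ".join(reversed(clauses))
```

With these definitions, preserves_constructor(m_entity, row, narrate, ("A", "B", "C")) holds, whereas substituting narrate_reversed for narrate makes the check fail, because the narrative order no longer mirrors the physical order.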

These careful distinctions about what kind of structure might be preserved when signs from 
one system are represented by signs from another will be used in Section 6 to formulate 
precise experimental hypotheses about the quality of representations. 

It is important to notice that semiotic morphisms need not be totally defined; that is, each of 
the functions denoted M can be undefined on some of what is in S1. For example, there need 
not be any representation in S2 for some entities in S1; in particular, some components of M 
could even be totally undefined, i.e., the empty function. 

An example of a semiotic morphism is the correspondence between the physical order of lights 
on the box, and the order in which clauses are given to describe the lights (narrative order). 

** This extends to functions and to non-entity arguments and/or values as before. 

The processes of signification and interpretation (i.e., of constructing objects and interpretants 
within a semiotic system) might be viewed in the light of semiotic morphisms, since the 
entities that they map from and to are both sign systems. This only makes sense because 
semiotic morphisms can be partial functions. For example, it is often the case that low level 
signs in complex systems, such as phonemes in the English natural language semiotic system, 
seem not to have either denotations or interpretations. Moreover, there is very little structure 
other than sequential succession to preserve at these levels. 

It seems clear that a structure preserving semiotic morphism M: S1 → S2 will faithfully 
represent all of the semiotic structure of S1 in terms of that available in S2. This would seem 
to be desirable for an optimal representation; however, if the resulting structures in S2 are too 
complex, then they may be hard for human beings to understand, and thus not really optimal. 
For example, if S1 consists of parse trees for English sentences and S2 consists of the usual 
"printed page" text format, then it is possible to translate all the syntactic information that is 
available in S1 into structures in S2 with so-called "phrase structure" notation, which uses 
brackets to delimit phrases and uses subscripts on brackets to indicate the class of phrase that is 
involved. For example, the sentence given previously would be represented by something like 

[[[the]_DET [light]_N]_NP [[on]_PREP [[the]_DET [left]_N]_NP]_PP]_NP ... 
in this notation. This may be useful for some purposes, but it is clearly not optimal for all 
purposes. The point to be noted is that there is some kind of a trade-off between the degree 
of structure preservation and the degree of complexity of the resulting representations. 

We now turn to the rather delicate issue of determining whether one representation (i.e., 
semiotic morphism) is better than another. One evident consideration is whether it preserves 
more structure than the other (of course, this will make it better only if the complexity of its 
representations is not too great). 

Definition 4: Let M' and M be morphisms from sign system S1 to sign system S2. Then M' 
preserves more structure than M does, provided that: 

1. if M preserves an entity e at level i, then so does M'; 

2. if M preserves a constructor r at entities e1,...,en, then so does M'; 

3. if M preserves a priority r>r', then so does M'; and 

4. if M preserves a property p, then so does M'. [] 

The real difficulties arise in trying to compare morphisms M and M' such that neither 
preserves strictly more structure than the other, or for which one preserves more structure but 
also produces more complex representations. For example, M might preserve more levels than 
M', whereas M' preserves more properties than M. It is for unclear cases such as this that our 
future experimental results will be especially interesting. The framework that we have 
developed suggests that preserving levels is more basic than preserving priorities, which is 
more basic than preserving properties. It is not difficult to formulate a number of specific 
experimental hypotheses that will test these suggestions, and we hope this will eventually lead 
to a workable notion of what it means to be a good representation. "Workable" here means 
that it will be possible to effectively determine of a given representation whether or not it is 
adequate to the task in hand. More optimistically, given sign systems S1 and S2, where S1 
contains abstract forms of the information to be conveyed, it may be possible to discover (to 
"compute" even) a semiotic morphism M from S1 to S2 that will give adequate 
representations in S2 for entities from S1. For example, S1 might contain instructions for 
repairing some piece of equipment, and S2 might be a color graphics terminal. The problem is 
then to generate displays that utilize the capabilities of that particular terminal reasonably 
well. 
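The relation "preserves more structure than" in Definition 4 is a partial order, so some pairs of morphisms are simply incomparable. The following sketch illustrates this under a simplifying assumption: each morphism is summarized by four sets recording what it preserves (levels stand in for the entities at those levels), and the class Preserved and the functions preserves_more and compare are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class Preserved:
    """Summary of what one morphism preserves (an assumed simplification)."""
    levels: FrozenSet[int] = frozenset()
    constructors: FrozenSet[str] = frozenset()
    priorities: FrozenSet[Tuple[str, str]] = frozenset()
    properties: FrozenSet[str] = frozenset()


def preserves_more(m_prime: Preserved, m: Preserved) -> bool:
    """True if everything preserved by m is also preserved by m_prime."""
    return (
        m.levels <= m_prime.levels
        and m.constructors <= m_prime.constructors
        and m.priorities <= m_prime.priorities
        and m.properties <= m_prime.properties
    )


def compare(m: Preserved, m_prime: Preserved) -> str:
    """Classify a pair of morphisms under the partial order."""
    forward = preserves_more(m_prime, m)
    backward = preserves_more(m, m_prime)
    if forward and backward:
        return "equal"
    if forward:
        return "second preserves more"
    if backward:
        return "first preserves more"
    return "incomparable"


# M preserves more levels; M' preserves more properties.
M = Preserved(levels=frozenset({1, 2, 3}), properties=frozenset({"LINELENGTH"}))
M_PRIME = Preserved(
    levels=frozenset({1, 2}),
    properties=frozenset({"LINELENGTH", "ALPHANUMERIC"}),
)
```

Here compare(M, M_PRIME) yields "incomparable", which is exactly the kind of case that the definition alone cannot settle and that experimental results would have to decide.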

Similar considerations arise in [Gentner 83]'s discussion of successful and unsuccessful natural 
language analogies. We now quote from the summary of that paper: 

The structure-mapping theory describes the implicit interpretation rules of 
analogy. The central claims of the theory are that analogy is characterized by the 
mapping of relations between objects, rather than attributes of objects, from base to 
target; and, further, that the particular relations mapped are those that are 
dominated by higher-order relations that belong to the mapping (the systematicity 
claim). These rules have the desirable property that they depend only on syntactic 
properties of the knowledge representation, and not on the specific content of the 
domain. 

Our approach introduces a finer structure on the source and target domains, and thus permits 
finer hypotheses about what makes analogies good. In addition, our approach is not limited to 
natural language as the target sign system, and considers representations other than analogies. 

We now introduce the notion of a subsystem of a sign system. This notion has already been 
used informally in the discussion of choice of media mix at the beginning of this subsection, 
and will be used again in the next subsection. 

Definition 5: Given sign systems S and S', we will say that S' is a subsystem of S provided 
that: every level of S' is also a level of S; every entity of S' is also an entity of S and entities in 
S' have the same level in S' as they do in S; every property function and predicate in S' is also 
one in S, and has the same values in S' as in S; every constructor function of S' is also a 
constructor of S, and constructors in S' yield the same results in S' as they do in S, and also 
have the same parameters and context conditions; and finally, the ordering on the constructors 
of S' is the same as that on those constructors in S. [] 

Now suppose that we are given sign systems S1 and S2 and a semiotic morphism M from S1 to 
S2. Then the set of entities M(e) in S2 for which M is defined for some entity e in S1 has: a 
set of levels, inherited from those of S2; a set of constructors, also inherited from those of S2 
(but they will have to be undefined whenever combining entities of the form M(e) fails to yield 
another of the same form in S2); a priority ordering on these constructor functions, namely the 
same one that S2 has; and also the functions and predicates from S2, now thought of as 
expressing properties of entities of the form M(e) for e in S1. In short, the entities of the form 
M(e) form a subsystem of the sign system S2; we call it the image subsystem of the semiotic 
morphism M. 

3.4.1 Iconicity and Naturalness 

As discussed in Section 3.1, semiotics distinguishes three types of sign — the index, the icon, 
and the symbol. The symbol is fully arbitrary, in the Saussurean sense. The index as signifier 
is a necessary (or probable) concomitant of the signified: smoke as an index of fire. It is the 
icon which poses the most interesting questions for the relation of signifier and signified. The 
accepted definition of the icon is that it involves an actual resemblance between signifier and 
signified; a portrait signifies its subject by resemblance, not by convention. (Compare, for 
example, a highly conventionalized political caricature.) 

This definition implies a specific directionality to the relation between signifier (S2 in 
Definitions 3 and 4) and signified (S1), in which the signified is more natural or basic than the 
signifier in some sense. Thus, it is often assumed that a diagram, drawing, or visual icon is 
more basic, easier to comprehend, and freer of the arbitrary conventions of language. (We 
find a folk theory of iconicity in the proverb: "A picture is worth a thousand words.") 

However, it is important to note that pictures, diagrams, etc., are only partially iconic in this 
sense, and also contain a component of conventional representation that must be learned. For 
example, Venn diagrams may appear fully iconic of boolean relations to someone accustomed 
to using them; but to someone who has not learned the aspect of convention in this only 
partially iconic semiotic system, a Venn diagram may be iconic only of a pretzel or symbolic of 
a brand of beer. 

In terms of the theoretical apparatus introduced in Section 3.4, the problem of generating an 
iconic representation is one of constructing a semiotic morphism from one sign system to 
another. The considerations of the above paragraph suggest that the set of all representations 
that are so generated, viewed as a sign system (this is the image subsystem of the full system 
of possible representing signs) should be in some sense simpler, more natural, or more basic (to 
humans) than the original set of signs. It is hoped that experimental explorations along these 
lines will lead to a deeper understanding of iconicity. 

4 Some Further Analytic Concepts 

We now consider some additional concepts used to formulate the variables and hypotheses in 
Sections 5 and 6. We begin with the possible cognitive structures of the task domain, and 
then consider some linguistic issues, including syntactic placement and strength, focus, and 
indexing. 

4.1 Cognitive Structure of the Task Domain 

The task of explaining the logic box (Figure 1) can be fulfilled by accounts based on at least 
five different ways of understanding the task domain: 

1. Behavioral. This is a simple, unanalyzed description that matches a pattern of lights to 
corresponding switch positions. Loosely speaking, an account at this level sounds like a 
description rather than an explanation. 

2. Combinatorial. An account at this level considers the possible patterns of lights as an 
aggregate. This might be displayed in a table, such as a truth table of the relation of 
switch positions and lights. An optional addition at this level explains that only four 
combinations of switch position are possible, since each of the two switches has two 
positions (2 x 2 = 4). 

3. Logical. Such an account would indicate the Boolean functions of the switch positions 
represented by each light, using primitive functions like AND, OR, and IF. Depending 
on the background of the audience, accounts at this level would differ in how full an 
explanation of logical functions is required. 

4. Electronic. An account at this level might use a circuit diagram to indicate how the 
relation between switch position and light patterns is accomplished. 

5. Physical. An account at this level would utilize the principles of physics and chemistry 
underlying the previous level of electronics. 

The first three of these form a hierarchy by levels of abstraction, while the first, fourth and 
fifth form a hierarchy by levels of reduction. We have found instances of the first three of 
these levels in instructors' explanations, and the fourth in their briefings and debriefings; the 
fifth was included for completeness. 

This categorization of possible cognitive organizations of the task domain relates to work by a 
number of researchers on the cognitive organization of explanations. For example, [Kieras 82] 
distinguishes knowledge of what a device is for, how to operate it, and how it works. The first 
two levels of description of our task correspond to varying degrees of knowledge of the second 
type: how to operate the device. The third, fourth, and fifth levels correspond to varying 
degrees of knowledge of how the device works. We note that a description at any of these five 
levels may function as an explanation, depending on the purpose of the description and the 
existing level of understanding of the audience. Similarly, in a study of Navy instruction 
manuals, [Stevens & Steinberg 81] provide a typology of explanations beginning at the 
behavioral level and proceeding to more abstract forms of explanations. (No exact 
correspondence between the higher levels of their taxonomy and ours is possible, since ours is 
a simple, non-branching tree structure, while theirs is a matrix of four two-way distinctions.) 

The first round of experiments gave instructors highly nondirective instructions, telling them 
to teach students how to check whether the device was doing what it was supposed to. This 
produced explanations of types one and two. Interestingly, although the audience was a group 
of community college students having no background in mathematics or electronics, many 
students found these explanations unsatisfactory, and in the subsequent debriefing session 
requested further information. Examples of such comments are: 

(6) The thing is so easy to understand, I mean, it's, that we 
look for the complications, you know, we're trying to 
look for something that you know, what is, what's there, 
and we have to really explain it and you haveta get 
into ... 

(7) I know. It's hard to explain because it's so easy. 

These comments strongly suggest that the cognitive structure underlying an explanation is an 
important variable for comprehension. Our later experiments elicited explanations at other 
cognitive levels, by asking the instructor to present the device as the control panel of a set of 
sluices, with the switches controlling gates and the lights indicating whether or not the sluices 
are open; this required students to understand not only the current relations but the basis 
underlying any possible set of relations. 

In our first set of pilot experiments, the comprehension task (for the audience) was to write an 
explanation of the box, on the basis of the explanation given by the instructor. A similar 
procedure can be used to test comprehension of any of the cognitive organizations listed above 
(of course, the writing skills of the subjects will also affect such a measure). Our Phase II 
experiments will use more focused test questions to probe particular cognitive organizations. 
Thus, a question such as Can all four lights be on at once? can be answered from 
simple observation at the behavioral level. In contrast, a question like Does any light 
correspond to the logical function ((not Switch 1) or (not Switch 2))? requires 
some comprehension at the third level, that of Boolean functions. 
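The two sample questions can be answered mechanically from a truth table, which may help show how the combinatorial and logical levels differ. The sketch below is purely illustrative: the Boolean functions assigned to lights A through D are hypothetical stand-ins chosen for the example (the report does not specify the box's wiring here), and True is taken to mean "switch up" or "light on".

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

Switches = Tuple[bool, bool]

# Hypothetical light functions; the real logic box may differ.
LIGHTS: Dict[str, Callable[[bool, bool], bool]] = {
    "A": lambda s1, s2: s1 and s2,
    "B": lambda s1, s2: s1 or s2,
    "C": lambda s1, s2: s2,
    "D": lambda s1, s2: (not s1) or (not s2),
}


def truth_table(lights) -> Dict[Switches, Dict[str, bool]]:
    """Combinatorial level: one row per switch combination (2 x 2 = 4 rows)."""
    return {
        (s1, s2): {name: bool(f(s1, s2)) for name, f in lights.items()}
        for s1, s2 in product([False, True], repeat=2)
    }


def all_on_at_once(table) -> bool:
    """Behavioral-level question: is there a row with every light on?"""
    return any(all(row.values()) for row in table.values())


def lights_matching(table, target: Callable[[bool, bool], bool]) -> List[str]:
    """Logical-level question: which lights compute the given function?"""
    names = list(next(iter(table.values())))
    return [
        name
        for name in names
        if all(row[name] == bool(target(*sw)) for sw, row in table.items())
    ]


TABLE = truth_table(LIGHTS)
```

Under these assumed functions, all_on_at_once(TABLE) is False, and lights_matching(TABLE, lambda s1, s2: (not s1) or (not s2)) returns ["D"]. The first answer needs only inspection of the four rows, while the second requires recognizing a Boolean function.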

4.2 Sentential Syntax 

A number of issues at the syntactic and lexical levels appear to affect the comprehensibility of 
explanations. These include the syntactic placement of information and the strength of 
structural markers. Variables at this level may not be highly trainable for a human instructor 
delivering a non-scripted explanation, but may be extremely valuable in scripting computer 
produced or videotaped explanations. 

4.2.1 Syntactic Placement of Information 

In a semantically restricted domain like that of the present study, it is possible to examine the 
syntactic placement of information quite precisely. This is valuable because syntactic 
placement is an important organizational device for discourse that allows the analyst to 
determine what information is given major importance and what information is given minor 
importance. Our task domain involves three basic types of information: information about the 
switches, information about the lights, and information about relations between the two. 

Linguistic research has shown that there is a continuum of syntactic constituents ranging from 
the maximally sentence-like on downward. It has been further shown that the more important 
information is, the more likely it is to be placed in sentence-like syntactic constituents [Linde 
74, Ross 73]. To aid comprehensibility, it appears that important information should be 
placed in syntactically heavy constituents, that is, in constituents that are quite sentence-like. 
Similarly, semantically parallel information should be placed in syntactically parallel units. 

4.2.2 Strength of Structural Markers 

Structural markers are pieces of text that invoke internal nodes of the discourse structure tree, 
such as STATEMENT/REASON, IF/THEN, and EXAMPLE, or that indicate movement in 
the tree. Our formal theory of discourse structure states that the first of these indicates 
relations between pieces of information, while the second type indicates change of focus of 
attention. Text invoking these markers may do so with varying degrees of strength. There 
are a number of dimensions that combine to produce strength of markers, including weight of 
the syntactic placement of the marker, degree of semantic ambiguity or univocality, and 
length in words. 

4.3 Focus 

To describe the semantics of these explanations, the additional notion of focus is required. 
The logic box employed in our explanation task has both lights and switches, and the patterns 
of each may vary. A coherent description must focus on one of these, describing the other in 
terms of the item in focus. (8) shows a focus on lights, while (9) shows a focus on switches. 

(8) What happens is each of those lights is a logical 
function, which means you know, true and false, or 
yes and no, of the two switches. So, for instance, 
this light C is on, just depends on Switch Two. Now 
whenever Two is up, C is on. It doesn't matter what 
One is on. 

(9) We'll go through it again slowly. O.K. In one 
position of these two switches, where One is up and 
Two is down, all these lights are off. We take 
Switch One and move it to its other position, three 
of the lights come on. We reverse this combination 
and make them both go up, we got again three 
of the lights are on but a different three lights. 
And, if we move em till both down, again, 
we got three lights on but these two lights are 
are changed over. 

Issues involving focus include the question of whether there is an optimal focus for a given 
task, and the effects of maintaining or switching focus. Preliminary results suggest that 
changes of focus are confusing and that poor placement within the explanation structure can 
make them even more confusing; (10) is an example of such a change. 

(10) So, in a condition where they're all off, this switch, 
Two, is down, and this switch, One, is up, and if we 
change the position of just one switch, we'll change the 
condition of the lights. So we'll go from all off to 
three of these lights going on and the three lights that 
come on are A, B, and D. If we go back to this situation 
which is where we started, they'll all go back off again. 

This explanation begins with a focus on the switches and changes in the middle to a focus on 
the lights. 

The taxonomy of explanation types given in [Stevens & Steinberg 81] contains a number of 
distinctions that correspond to this notion of focus; these are distinctions at the same level of 
abstraction, such as a "stuff-state-attribute" description of a physical system, versus a 
"stuff-as-a-transport-medium" description of the same system. 

4.4 Prior Text Reference 

To understand explanations (or indeed any discourse type) we must take account not only of 
the linguistic form of the explanation and its semantic structure, but also of the knowledge 
shared by its speaker and addressees. In recent years, the fields of cognitive science, 
linguistics, and artificial intelligence have all been concerned with the effects of shared 
knowledge on linguistic and cognitive structures. The present discussion is concerned with the 
linguistic forms that speakers use to indicate that a particular body of shared knowledge is 
relevant or necessary in order to understand a given explanation. 

As we define it, a prior text reference is a pointer in a given text to some body of 
information not present in that text, or to some prior text that speaker and audience are 
presumed to share. This definition derives from the discussion in [Becker 81] of what they 
term indexing of prior text. We have changed the term to avoid confusion with Peirce's 
related but different sense of the term index. Prior text reference can be accomplished 
explicitly, as in As we were discussing last week about circuits, and it can be 
accomplished implicitly, as in Does this have to do with circuitese? For a given prior 
text reference to succeed, it must indicate a body of knowledge or a prior text that the 
audience actually has mastered. Thus, the reference If you remember the commutative 
law from high school algebra will succeed only if the audience remembers the 
commutative law from high school algebra. Another example from our data is the statement 
that the operation of the logic box is like a set of traffic lights. This will succeed 
only if the audience does in fact know enough about how traffic lights work. Section 6.1 gives 
hypotheses about prior text reference. 

5 Variables of Interest 

This section discusses some variables applicable to our data that appear to be important for 
the comprehensibility of instruction; these variables are used in the hypotheses of the following 
section. We expect that further variables will be found as the research progresses. 

5.1 Discourse Level Semantic Variables 

5.1.1 Cognitive Structure of the Task Domain 

As discussed in Section 4.1, explanations can be based on five different cognitive organizations of 
the task domain: behavioral, combinatorial, logical, electronic, or physical. Similarly, the 
comprehension task can probe any of these. 

5.1.2 Focus 

An explanation can be based either on the condition of the lights or on the positions of the 
switches, treating the other as functionally dependent on the one chosen as basic. Similarly, 
the comprehension task can be based either on the lights or on the switches. 

5.1.3 Prior Text Reference 

Prior text reference, as defined in Section 4.4, may be present or absent in any given part of 
an explanation. 

5.1.4 Form of Prior Text Reference 

Prior text reference may be either explicit or inferential. Boolean algebra tells us ... is 
an explicit prior reference. Is that circuitese? is inferential. 

5.2 Variables of Discourse Structure 

This subsection uses the distinction between discourse type and discourse unit introduced in 
Section 2.1. 

5.2.1 Discourse Type

An explanation may consist of any of several discourse types, including reasoning, narrative, 
and pseudonarrative. 

5.2.2 Number of Discourse Units

An explanation may contain one or more instances of each of its discourse types, and it may 
consist of several different discourse types. For example, a single explanation may consist of 
three reasoning units, it may consist of one reasoning unit and two pseudonarrative units, etc. 

5.2.3 Presence or Absence of Discourse Summary 

Any of the discourse types that have been found to perform the function of explanation may 
have as part of their structure an optional summary, giving an overview of the entire 
explanation, or of some part of it. 

5.2.4 Placement of Summary

Within a given discourse unit, a summary may be placed at the beginning, at the end, or may
be embedded within the discourse unit. (This distinction between an initial and a final
summary is related to the distinction commonly made in rhetoric between deductive and
inductive paragraph structure. A deductive structure places the topic sentence at the
beginning of the paragraph and follows it with supporting material; an inductive structure
begins with cases or examples building up to a final general statement.)

5.2.5 Presence or Absence of Explicit Structural Markers 

In the construction of discourse structure trees, transformations can establish internal nodes in
the tree, and can also alter the focus of attention within the tree. These functions may be 
accomplished explicitly, by separate pieces of text, or they may be accomplished implicitly, as 
part of the semantics of text primarily devoted to content. Such implicit markers depend on 
the fact that each discourse type has a characteristic default node type. For example, in a 
narrative, the default node type is SEQ, corresponding to the narrative presupposition, the
rule of interpretation stating that events are assumed to have occurred in the same order as 
the main clauses that refer to them. Thus, in a narrative, it is sufficient to say He moved the 
second switch. Two lights went out. It is possible, but not necessary to add a marker 
such as and then, getting He moved the second switch and then two lights went out. 

5.2.6 Strength of Marker 

In the case where an explicit structural marker is present, we may ask how strongly the 
marker is indicated. Strength of indication depends on a number of factors:

1. Syntactic heaviness of the marker, i.e., whether it is a single conjunction, a phrase, a 
dependent clause, or a sentence. 

2. Length in words of the marker. (This is related but not identical to 1.) 

3. Explicitness of the marker. For example, a marker like so is quite inexplicit, and may 
indicate causality, simple sequence, resumption of a previous topic, etc. In contrast, a 
marker like because of that indicates causality explicitly and unambiguously.

4. Position in the sentence. A position at the beginning of the sentence is heavier than a
later one; there are a number of syntactic devices in English that can move a constituent 
to front position. 

5.2.7 Penetrance of Discourse Tree

Different choices of node type and different orderings of subordinate nodes (which correspond
to different embeddings of clauses) can lead to a variety of different tree shapes. In general,
discourse trees can be described as fundamentally deep structures, or broad structures. A
relevant variable, related to a quantity called penetrance in artificial intelligence [Nilsson 71],
is the ratio of the average path length A to the total number N of nodes. Thus,
$P = A/N = \sum_{i=1}^{T} L_i / (TN)$, where T is the number of paths and $L_i$ is the length of
the ith path, since $A = \sum_{i=1}^{T} L_i / T$. P is larger for deep trees and smaller for broad
or shallow trees.
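
The ratio of average path length to node count can be computed mechanically. The sketch below is only an illustration: it assumes that a "path" runs from the root to a leaf and that its length is counted in edges, neither of which is fixed by the definition above, and the tree encoding and function names are hypothetical.

```python
from typing import Dict, List

# Hypothetical encoding: each node name maps to the list of its children.
Tree = Dict[str, List[str]]


def root_to_leaf_lengths(tree: Tree, root: str) -> List[int]:
    """Return the length, in edges, of every root-to-leaf path."""
    lengths = []
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        children = tree.get(node, [])
        if not children:
            lengths.append(depth)
        for child in children:
            stack.append((child, depth + 1))
    return lengths


def count_nodes(tree: Tree, root: str) -> int:
    """Return N, the total number of nodes in the tree."""
    return 1 + sum(count_nodes(tree, child) for child in tree.get(root, []))


def penetrance(tree: Tree, root: str) -> float:
    """Return P = A / N, where A is the average root-to-leaf path length."""
    lengths = root_to_leaf_lengths(tree, root)
    average = sum(lengths) / len(lengths)  # A
    return average / count_nodes(tree, root)  # P


# Two five-node trees: a single chain (deep) and a root with four leaves (broad).
deep = {"a": ["b"], "b": ["c"], "c": ["d"], "d": ["e"]}
broad = {"a": ["b", "c", "d", "e"]}

print(penetrance(deep, "a"), penetrance(broad, "a"))
```

Under these assumptions the chain has one path of length 4 among 5 nodes, and the broad tree has four paths of length 1 among 5 nodes, so the deep tree receives the larger value.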

5.2.8 Explicit Establishment of the Basis of Parallel Structure

Many of the explanations in our data contain parallel structures. For example, there may be
four subtrees corresponding to the four lights. Or, in a differently organized explanation, there
may be four subtrees corresponding to the four possible switch positions. A variable of 
interest is whether or not the basis of this parallel structure is made explicit. This could be 
done by a reference to the existence of the four lights in the first case, or to the simple 
computation of two switches times two switch positions in the second. Because of the 
existence of the demonstration prop, if the focus is on the lights, the explicit establishment of 
the four lights can be accomplished simply by directing attention to the box itself. It seems 
important that parallel structures in the text be ordered in a way that clearly corresponds to 
the geometry of the physical world in such a case. Such an iconicity between visual and 
linguistic representations would be particularly important in generating graphics in
computer-aided instruction.

5.3 Sentential Level Variables 

5.3.1 Syntactic Placement of Information 

As discussed in Section 4.2, the syntactic form of the constituent in which information is 
placed is one indication of its presumed importance. This variable appears in a number of 
hypotheses. 

6 Hypotheses

As already noted, Phase II of this project will test the most interesting hypotheses suggested
during the analysis of Phase I data, by presenting experimentally varied explanations (of the 
logic box task) to learners. Once the hypotheses have been selected and the test explanations 
constructed, we will develop suitable dependent measures of comprehension (verbal and/or 
behavioral). Subjects will be randomly assigned to experimental conditions so that, although 
subject knowledge and education may have some influence, it will not be specific to any 
condition. Thus, hypotheses about explanation effectiveness can be tested in terms of student 
comprehension. 
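
Random assignment of this kind can be sketched as follows. This is a minimal illustration rather than the project's procedure: the subject identifiers, the condition names, and the function `assign_to_conditions` are all hypothetical.

```python
import random
from typing import Dict, List, Sequence


def assign_to_conditions(
    subjects: Sequence[str], conditions: Sequence[str], seed: int = 0
) -> Dict[str, List[str]]:
    """Shuffle the subjects, then deal them out to conditions in rotation.

    Shuffling first means that background factors such as prior knowledge
    are not systematically tied to any one condition.
    """
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    groups: Dict[str, List[str]] = {condition: [] for condition in conditions}
    for index, subject in enumerate(shuffled):
        groups[conditions[index % len(conditions)]].append(subject)
    return groups


# Hypothetical example: 60 subjects, two versions of the same explanation.
subjects = [f"S{i:03d}" for i in range(60)]
conditions = ["summary_initial", "summary_final"]
groups = assign_to_conditions(subjects, conditions)
print({condition: len(members) for condition, members in groups.items()})
```

Dealing in rotation after the shuffle keeps the cell sizes equal, which simplifies the later analysis of variance.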

This section gives some candidate hypotheses found so far. We expect that the statement of 
these hypotheses will be refined during the process of testing; i.e., that many of these 
hypotheses will not be tested in exactly the form given here, and some will not be tested at all. 
However, they all seem to represent reasonable intuitions about instructional processes and 
valuable directions for further investigation. The hypotheses to be tested will be selected 
according to the following criteria: 

1. Potential application of the hypothesis to the improvement of human or computer based 
instruction. 

2. Possibility of incorporating and varying the variables of interest in instructional 
discourse. 

3. Possibility of accurate measurement of the degree of variation. 

4. Possibility of holding other variables constant. 

Notice that dependent measures can be constructed in at least three ways: (1) performance on 
physical tasks involving actual use of the logic box; (2) performance on comprehension tasks, 
such as multiple choice questions; and (3) performance on the task of generating an 
explanation based on that given by the instructor. Tasks of type (2) and (3) can be aimed at 
any of the five levels given in Section 4.1, but tasks of type (3) would be difficult to score. 
The hypotheses follow, subdivided into three main categories.

6.1 Discourse Level Semantic Hypotheses

1. Explanations at the combinatorial and logical levels will result in superior 
comprehension. (Explanations at the behavioral level will result in inferior 
comprehension because their cognitive structure is relatively simple and therefore they 
must rely on rote memory; explanations at the electronic or physical levels will result in 
inferior comprehension because they are too abstract for most learners. Note that this 
assumes dependent measures based on performance of tasks at the intermediate levels, 
since explanations whose cognitive level matches that of the comprehension instrument 
will get the best scores.) 

2. Comprehension will be impaired if the focus changes among the direct subordinates of
an AND, OR or SEQ node.

3. Comprehension will be assisted if the focus of a summary corresponds to the focus of the 
nodes that are being summarized. 

4. Explanations based on semiotic morphisms that preserve the level structure, especially
the basic levels (if there are any), will result in superior performance to morphisms that
do not.

5. Explanations based on semiotic morphisms that preserve primary constructors (if there
are any) will result in superior performance to explanations based on morphisms that do
not.

6. Explanations based on semiotic morphisms that preserve properties at the expense of
basic levels or primary constructors will produce inferior performance to explanations 
based on morphisms that preserve basic levels or primary constructors at the expense of 
properties. 

7. Comprehension will be assisted by the presence of prior text reference. 

6.2 Hypotheses at the Level of Discourse Structure 

8. Comprehension will be assisted by the inclusion of summaries. 

9. Comprehension will be assisted more by initial placement of summaries than by medial 
or final placement. 

10. Comprehension will be assisted more by a broad tree than by a deep tree. A more
precise formulation of this hypothesis is that structures with smaller penetrance will be
more easily comprehended than structures with larger penetrance, since penetrance, as
defined above, is larger for deep trees.

11. Comprehension will be assisted by explicit markers of discourse structure; these may be 
either verbal or visual. 

12. Comprehension will be hampered by interruption of parallel structures, even if the 
interruption represents a summary. 

13. Complexity of linguistic structure and strength of marking tend to attenuate within a
parallel structure, and the greater the number of parallel items, the greater the degree of
attenuation. Comprehension will be assisted by or unaffected by such attenuation of
structural marking if the basis of the parallelism has been comprehended by the
audience.

14. For novices, comprehension will be superior with pseudonarrative structure rather than
reasoning structure. (Note that all of the subjects will be novices.) 

15. Comprehension will be superior if the focus and level of a summary correspond to the 
focus and level of the structure being summarized. 

6.3 Hypotheses at the Sentential Level 

16. Comprehension will be assisted if the strength of a POP marker is proportional to the 
size of the movement in the tree that it accomplishes. 

17. Comprehension will be assisted by parallel syntactic placement of semantically parallel 
information units. 

18. Comprehension will be assisted if information that is structurally important is placed in
syntactically heavy constituents. 

I. Method

This appendix brings together various discussions from the main text on the methodology of 
Phase I of the project, including the order of experimental tasks, selection of subjects, and 
experimental procedures employed, and also outlines the methodology of Phase II. 

I.1 The Order of Experimental Tasks
I.1.1 Phase I

The first series of project activities has been directed toward the design of explanatory
protocols for experimental test in Phase II.

1. A circuit diagnosis problem (see below) was presented to individuals whose teaching 
should benefit from the proposed improved instructional paradigms. We used faculty in 
engineering and electronics at a local community college as instructors, and community 
college students with no background in engineering or mathematics as subjects. The 
instructors were briefed on the problem, and then asked to present material to prepare 
students to perform the indicated task. A variety of briefings were tested before a 
suitable one was determined. The first presented the logic box as a piece of equipment 
coming off an assembly line which was to be tested by the student. This briefing proved 
to be unsatisfactory because it elicited only explanations at the behavioral level, and 
could not be used to elicit any of the more complex cognitive levels. The second form of 
briefing presented the box as the control device of a set of sluices, and the final version 
elaborated this to a control device for an irrigation system providing varying mixtures of 
fertilizer and water. 

2. Five instructors were used as subjects in six experiments. All experiments were recorded 
on audio tape and then transcribed, yielding a total of 124 pages of transcript for the
instructor briefing and subsequent instructional session. (There are also debriefings for 
students and/or instructors for some sessions.) Each such session lasted between one- 
half and one hour. The first two experiments used the same instructor, and also used 
groups of community college students as an audience, 4 students in the first and 5 in the 
second. Instructors' presentations were recorded on audio tape. 

3. Following elicitation and transcription, we then analyzed the structure of the 
explanations obtained using current linguistic theory. This required studying linguistic 
strategies at the levels of the sentence and the discourse unit, and also studying the 
effect of different kinds of questions asked by students on the elicited explanation 
structure. This analysis makes use of the mathematical theory of discourse structure. 
This analysis was used to identify significant variables and to formulate hypotheses
about relations between linguistic form and task performance. These hypotheses are
presented in Section 6.

4. The last step in Phase I was the preliminary selection of the most interesting hypotheses
for experimental testing in Phase II. Selection is based on the following criteria:
likelihood that the hypothesis will have significant effects on learning; possibility of
incorporating and varying the variable of interest in a natural discourse; possibility of
accurate measurement of the variable of interest; and possibility of holding the other
variables present constant.

I.1.2 Phase II

In this phase of the project, we will first collect and analyze video-taped versions of our 
explanation task, since the analysis of the audio-taped experiments of Phase I indicated that 
some of the most theoretically interesting and practically important issues of multimedia 
instruction can only be studied using video data. We will then refine the hypotheses in the 
light of this data, and subject the most promising hypotheses to experimental validation. The
tasks of this phase are the following:

5. We will perform at least two video-taped sessions of the instructional task, and will 
analyze the forms of multimedia instruction using the theory of semiotic morphisms 
already developed. 

6. Based on the results of Phase I and task 4, we will make a final selection of the
most promising hypotheses for testing. 

7. To test these hypotheses, standard variations of explanations of the circuit diagnosis 
problem will be administered to groups of learners. These may be given by actors, via
videotape, or by computer. While the cell design depends on the nature and number of 
independent explanation variables that emerge from Phase I, enough subjects will be 
tested to enable statistical generalization (e.g., at least 30 per cell of the design). 

8. If the results suggest it, follow-up trials will be conducted with promising combinations
of variables or settings (e.g., with vs. without interactive discussion; with vs. without
diagrammatic aids). 

9. Dependent measures of effect (including both test and task performance) collected from 
learners will be examined by analysis of variance. In addition, effects will be assessed to 
determine the contribution (if any) of exogenous variables such as age, education level,
and previous work history on learners' response variables.

10. Learning data will be examined primarily by means of analysis of variance where 
alternative explanatory approaches serve as independent variables and test (verbal) and 
task (behavioral) outcomes serve as dependent variables. Associations among
background variables and dependent measures will be assessed correlationally;
exogenous variables that importantly influence outcomes can be incorporated into
major analyses as covariates. Results of these analyses are expected both to contribute
to basic understanding of the effectiveness of alternative explanation approaches and to
provide a foundation for recommendations addressed to the design of an automatic
explanation generator, and to the improvement of instructional discourse.
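
The core computation of a one-way analysis of variance, with explanation type as the independent variable and a comprehension score as the dependent variable, can be sketched as follows. The condition names and scores are invented purely for illustration and are not project data; the function `one_way_anova` is a hypothetical helper, and a real analysis would also need a significance test and the covariates mentioned above.

```python
from typing import Dict, List, Tuple


def one_way_anova(groups: Dict[str, List[float]]) -> Tuple[float, int, int]:
    """Return (F, df_between, df_within) for a one-way design."""
    all_scores = [x for scores in groups.values() for x in scores]
    grand_mean = sum(all_scores) / len(all_scores)

    ss_between = 0.0  # variation of the cell means around the grand mean
    ss_within = 0.0   # variation of the scores around their own cell mean
    for scores in groups.values():
        cell_mean = sum(scores) / len(scores)
        ss_between += len(scores) * (cell_mean - grand_mean) ** 2
        ss_within += sum((x - cell_mean) ** 2 for x in scores)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented comprehension scores for three hypothetical explanation conditions.
scores = {
    "initial_summary": [8.0, 9.0, 7.0, 8.0],
    "final_summary": [6.0, 7.0, 5.0, 6.0],
    "no_summary": [5.0, 4.0, 6.0, 5.0],
}
f_stat, df_b, df_w = one_way_anova(scores)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F relative to the reference distribution with the same degrees of freedom would indicate that the explanation conditions differ in mean comprehension by more than within-cell variation alone would suggest.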

I.2 Procedures

The task for both the elicited explanations and the learning trials makes use of the logic box 
of Figure 1. This box has two switches and four lights, each light being a logical function of 
the positions of the two switches. In Phase I, instructors were shown the box and how it 
works; they were then requested, in a nondirective manner, to provide an explanation of how 
it works to a "typical" group of students. 
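
As a rough illustration of such a device, the sketch below models a box with two switches and four lights and enumerates its behavior. The particular Boolean functions assigned to the lights (AND, OR, exclusive OR, NAND) are assumptions made only for this example, since the actual functions are given by the figure and not restated here; the names `LIGHTS`, `light_states`, and `truth_table` are likewise hypothetical.

```python
from itertools import product
from typing import Callable, Dict, Tuple

LightFunction = Callable[[bool, bool], bool]

# Assumed assignment of logical functions to lights (illustrative only).
LIGHTS: Dict[str, LightFunction] = {
    "light_1": lambda a, b: a and b,        # AND
    "light_2": lambda a, b: a or b,         # OR
    "light_3": lambda a, b: a != b,         # exclusive OR
    "light_4": lambda a, b: not (a and b),  # NAND
}


def light_states(switch_a: bool, switch_b: bool) -> Dict[str, bool]:
    """Return which lights are lit for one setting of the two switches."""
    return {name: fn(switch_a, switch_b) for name, fn in LIGHTS.items()}


def truth_table() -> Dict[Tuple[bool, bool], Dict[str, bool]]:
    """Enumerate all 2 x 2 = 4 switch settings and the resulting lights."""
    return {
        (a, b): light_states(a, b)
        for a, b in product([False, True], repeat=2)
    }


for (a, b), lights in truth_table().items():
    lit = [name for name, is_on in lights.items() if is_on]
    print(f"switches=({int(a)}, {int(b)}) lit={lit}")
```

Reading the table row by row corresponds to an explanation focused on the switches, while reading it column by column corresponds to one focused on the lights.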

In the second part of Phase II, groups of students will be presented with instructions about 
how the logic box works. Then each will be tested, both verbally and behaviorally, for 
comprehension. Phase I procedures have been administered to subjects (instructors) 
individually, while Phase II procedures apply to groups. Data from Phase I consists of 
verbatim protocols for linguistic analysis, while data from Phase II will consist primarily of
standardized test and task scores for statistical analysis. 

I.3 Subjects

Subjects for Phase I procedures were community college instructors who are accustomed to 
giving explanations of circuit logic. Elicitation continued until a variety of patterns had been 
observed and replicated. 

Subjects for Phase II will be individuals (male and female) aged about 17 to 25 who are not
specially trained or experienced in circuit logic. The N will be determined by the cells of the 
design for testing hypotheses generated in Phase I efforts. A minimum of 250 and a maximum 
of 550 subjects are anticipated. 
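
The relation between the anticipated N and the size of the design can be checked with simple arithmetic. The sketch below assumes the figure of at least 30 subjects per cell mentioned among the Phase II tasks; the resulting cell counts are an inference from those numbers, not a design stated in this report.

```python
MIN_PER_CELL = 30  # assumed from "at least 30 per cell of the design"


def max_cells(total_subjects: int, per_cell: int = MIN_PER_CELL) -> int:
    """Largest number of design cells that can each receive per_cell subjects."""
    return total_subjects // per_cell


for n in (250, 550):
    print(f"N = {n}: at most {max_cells(n)} cells of {MIN_PER_CELL} subjects")
```

On this reading, the anticipated range of subjects would support a design of roughly 8 to 18 cells.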

II. Grammar of the Command and Control Speech Act Chain

II.1 Categories of the Grammar

This appendix discusses the command and control speech act chain, a specific kind of speech 
act chain which is the most typical discourse type for aviation discourse. In the most general 
sense, these rules can serve as an example of how any speech act chain can be analyzed. More 
specifically, we have found that the command and control speech act chains characteristic of 
aviation discourse are formally identical to the speech act chains of instructional discourse, in 
the sense that the same formal grammar describes them. 

In the aviation context, operationally relevant speech act chains typically concern possible 
actions or actions that have already been performed. According to the usual definition [Searle
69, Searle 71], speech acts can also be seen as linguistic acts that alter the perceived state of 
the world. This subsection presents a category system that includes both linguistic and 
physical acts; this is necessary for the formal description of the command and control speech 
act chains. 

The most general category is acts. This includes physical acts, command and control 
speech acts, and acknowledgements of such speech acts. A more specific category is 
speech acts, the basic category of interest for command and control. This category includes 
requests, reports, and declarations. 

Additional utterance categories of interest for the command and control speech act chain are 
plans and explanations, which may be embedded within a command and control speech act 
chain. (Plans and explanations as they occur in the command and control speech act chain 
are the same discourse types discussed in Section 2.1.) 

II.2 Subordination

This subsection discusses the elements used to construct command and control speech act
chains. These elements are of two types: the speech acts used in command and control; and
the subordinators that indicate the relationships among them. The present discussion focusses 
on how these categories function within the formal grammar of command and control speech 
act chains. An abbreviation for use in graphical representations is given for each
subordinator; these abbreviations use "square brackets," i.e., [...].

1. CHAIN: This node type is the top level subordinator for a sequence of command and 
control speech acts having the same major propositional content and constituting a 
command and control speech act chain. This node therefore marks the fact that a 
sequence of utterances is indeed a speech act chain; it is not usually indicated explicitly 
in the actual sequence of utterances. The abbreviation is simply [CHAIN].

2. REQUEST: Requests are the most typical command and control speech acts. They 
include questions, commands and suggestions. (A command can be viewed as a request 
that has been ratified by the speaker with relevant authority.) In the formal grammar, a 
request must have the form of a request node subordinating a single subtree, which is the 
act that is requested. (Searle's taxonomy calls these "directives.") The abbreviation is
[REQ].

3. REPORT: A report is an indication of some state of the world. The abbreviation is 
[REP]. In the formal grammar, reports have the form of a [REP] node subordinating a
single subtree giving the act or state reported. (11b) is an example.

(11a) CAM-2 Kh, what's the fuel show now buddy? 
(11b) CAM-3 Five
(11c) CAM-2 Five 

(1748:54) 

4. ACKNOWLEDGE: A command and control speech act (e.g., a request or declaration) 
can be acknowledged; but challenges, supports, and other acknowledgements cannot be 
acknowledged. (This is the kind of constraint on sequencing that the rules below are 
intended to capture.) For example, (12b) is an acknowledgement. The abbreviation is
[ACK]. An [ACK] node indicates the subordination of an acknowledgement to the
speech act that it acknowledges. 

(12a) C-1 You gotta keep em running, Frostie 
(12b) C-3 Yes, sir 

(1808:42) 

Two interesting further points about [ACK] nodes are: (1) the speaker of an
acknowledgement must be among the addressees of the request or report that it 
acknowledges; and (2) more than one addressee may produce an acknowledgement of the 
same speech act. 

5. STATEMENT/REASON: Subordinates a request or report on the left, and a reason 
supporting it on the right. It is abbreviated [ST/RSN]. It may also occur in the 
opposite order, abbreviated [RSN/ST]. 

6. STATEMENT/CHALLENGE: Subordinates a request or report on the left, and a
challenge to it on the right. It is abbreviated [ST/CH]. It may also occur in the
opposite order, abbreviated [CH/ST].

7. GOAL/PLAN: Subordinates a goal on the left, and a plan to achieve it on the right.
Abbreviated simply [GOAL/PLAN]. It may also occur in the opposite order,
abbreviated [PLAN/GOAL].

II.3 Rules

This subsection gives the rules of the grammar for command and control speech act chains in 
simple English, and also in a graphical form in Figure II-1. This grammar expresses how these
speech act chains are constructed in real time. It thus defines the sequences of operationally 
relevant speech acts that are possible in command and control discourse, and indicates some 
(but not all) of the sequences that are not possible. It should be noted that this is a grammar 
of social force rather than of linguistic form; that is, the rules apply to the social 
interpretations of utterances, rather than to the utterances themselves, or to the sequences of 
words or sentences that comprise them. 

In this grammar, nodes that must subordinate other nodes have "square brackets," e.g., 
[ACK], and nodes that indicate categories that will later be filled have "pointed brackets," 
e.g., <REPORT>. The first two rules simply define subcategories of given categories. They
are:

1. A command and control speech act, abbreviated <SPACT>, may be a request, a 
report, or a declaration, abbreviated <REQ>, <REPORT> and <DECL> 
respectively. 

2. An act, abbreviated <ACT>, may be a <SPACT>, an acknowledgement, or a 
physical act, abbreviated <ACK> and <PHACT> respectively. 

The basic entity being formalized, the command and control speech act chain, is indicated by 
a [CHAIN] node; all the speech acts that constitute a given chain will be subordinated to one 
such node. The beginning of the production of a command and control speech act chain is a 
single [CHAIN] node with two subordinate <SPACT> nodes; the fact that there are two 
such nodes expresses the fact that there must be at least two speech acts in a command and 
control speech act chain. The basic rule of development for command and control speech act 
chains is simply: 

3. A [CHAIN] node with n descendent nodes can be elaborated into a [CHAIN] node with
n+1 descendents. This expresses the fact that a command and control speech act chain
may be of any length; that is, it may contain any number of speech acts.

The next two rules are basically parallel; they indicate how <REQ> and <REPORT> 
nodes can be elaborated: 

4. A <REQ> node can be expanded into a [REQ] node subordinating an <ACT> node. 
This means that any request is a request for an action, either a physical action or a 
speech act. 

5. A <REPORT> node can be expanded into a [REPORT] node subordinating an
<ACT> node. This means that any report is a report of an action, either a physical act or
a speech act, or of a state of the world.

Next is a set of three rules that may be applied to any node [XX] that is either a [REQ] or a
[REPORT] node subordinating an arbitrary subtree: 

6. An [XX] node subordinating a subtree may be replaced by an [ACK] node subordinating 
[XX] with its subtree on the left, and an <ACK> node on the right. This means that
any report or request may be acknowledged. 

7. An [XX] node subordinating a subtree may be replaced by either: a [ST/RSN] node 
subordinating the [XX] node with its subtree on the left, and subordinating an 
<EXPL> node on the right; or a [RSN/ST] node with the same subordinate subtrees in 
the opposite order. This rule means that any report or request may be supported by 
giving a reason (RSN), having the formal structure of an explanation. 

8. An [XX] node subordinating a subtree may be replaced by either: a [ST/CH] node 
subordinating the [XX] node with its subtree on the left, and an <EXPL> node on the
right; or else a [CH/ST] node with the same subordinates in the opposite order. This 
rule means that any report or request may be challenged by a speaker giving an 
explanation of why it is a bad idea. 

The final rule permits the introduction of planning. 

9. A [REQ] node subordinating an arbitrary subtree may be replaced by a [GOAL/PLAN] 
node subordinating the [REQ] node with its subtree on the right, and a <PLAN> node
on the left. This means that any request may be incorporated as part of a plan; that is, 
the simple process of requesting an act can be elaborated into the process of planning. 
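
Rules 3 through 8 can be read as tree-building operations, and a minimal sketch of that reading follows. It is an illustration under stated assumptions rather than the authors' formalization: the `Node` class and all function names are hypothetical, unexpanded categories such as <ACK> and <EXPL> are kept as placeholder leaves, and rule 9 is omitted.

```python
from dataclasses import dataclass, field
from typing import List

# The [XX] nodes to which rules 6-8 may be applied.
ELABORABLE = {"[REQ]", "[REPORT]"}


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)

    def __str__(self) -> str:
        if not self.children:
            return self.label
        return f"{self.label}({', '.join(str(c) for c in self.children)})"


def leaf(label: str) -> Node:
    return Node(label)


def request(act: Node) -> Node:
    """Rule 4: a [REQ] node subordinating the requested act."""
    return Node("[REQ]", [act])


def report(act: Node) -> Node:
    """Rule 5: a [REPORT] node subordinating the reported act or state."""
    return Node("[REPORT]", [act])


def _require_elaborable(node: Node) -> None:
    if node.label not in ELABORABLE:
        raise ValueError(f"rules 6-8 do not apply to {node.label}")


def acknowledge(node: Node) -> Node:
    """Rule 6: [ACK] over the node (left) and an <ACK> placeholder (right)."""
    _require_elaborable(node)
    return Node("[ACK]", [node, leaf("<ACK>")])


def support(node: Node, reason_first: bool = False) -> Node:
    """Rule 7: [ST/RSN], or [RSN/ST] when the reason comes first."""
    _require_elaborable(node)
    expl = leaf("<EXPL>")
    if reason_first:
        return Node("[RSN/ST]", [expl, node])
    return Node("[ST/RSN]", [node, expl])


def challenge(node: Node, challenge_first: bool = False) -> Node:
    """Rule 8: [ST/CH], or [CH/ST] when the challenge comes first."""
    _require_elaborable(node)
    expl = leaf("<EXPL>")
    if challenge_first:
        return Node("[CH/ST]", [expl, node])
    return Node("[ST/CH]", [node, expl])


def chain(first: Node, second: Node, *more: Node) -> Node:
    """A [CHAIN] needs at least two subordinates; rule 3 allows any more."""
    return Node("[CHAIN]", [first, second, *more])


tree = chain(
    acknowledge(request(leaf("<PHACT>"))),
    challenge(report(leaf("<PHACT>"))),
)
print(tree)
```

Because `acknowledge` accepts only [REQ] and [REPORT] nodes, an attempt to acknowledge an acknowledgement, a support, or a challenge raises an error, which mirrors the sequencing constraint the rules are intended to capture.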


These rules are all given graphically in Figure II-1; graphical indications of focus of attention
are also given there. An extended example is given in the following subsection, illustrating
how these rules are used to analyze an actual command and control speech act chain.

Figure II-1: Graphical Presentation of Command and Control Rules

II.4 An Example of a Command and Control Speech Act Chain

The purpose of the preceding discussion has been to describe some constraints on command 
and control speech act chains, and in particular, to indicate some possible and impossible 
embeddings of social force. That is, we have attempted to specify what sequences of speech 
acts form command and control speech act chains, and what sequences do not. For example, 
an acknowledgement of a support of a request for an act A should not occur, although an 
acknowledgement of a request for an act A and a request for a support of a request for an act 
A may occur.

To illustrate this kind of sequencing, let us consider the data in example (13): 

(13a) CAM-1 Hey Frostie
(13b) CAM-3 Yes sir
(13c) CAM-1 Give us a current card on weight figure about another
            fifteen minutes
(13d) CAM-3 Fifteen minutes?
(13e) CAM-1 Yeah give us three or four thousand pounds on top
            of zero fuel weight
(13f) CAM-3 Not enough
(13g) CAM-3 Fifteen minutes is gonna really run us low on fuel here
(13h) CAM-? Right

(1750)

First of all, (13a) and (13b) form what is termed a "call-response" pair, that is, a call for
attention followed by an acknowledgement that the addressee is attending. Using the
concepts of this study, this can be seen as a request having empty propositional content,
followed by an acknowledgement; it cannot be seen as a command and control speech act
chain, because chains must have more than one subordinate node. Thus the pair (13a-b) is
indicated as shown in Figure II-2, where 0 indicates empty propositional content. Adding
(13c-d) to this yields the tree shown in Figure II-3, where c denotes the propositional content
of (13c) and d that of (13d).

    [ACK]
    /   \
   0     0

Figure II-2: A Call-Response Pair

        [CHAIN]
        /     \
    [ACK]   [ST/CH]
    /  \     /  \
   0    0   c    d

Figure II-3: A Challenge

(13e) refines this propositional content to say that there will be three or four thousand pounds
in fifteen minutes, denoted here as e. This is followed by an unusually strong challenge in
(13f), the propositional content of which, Not enough, is indicated by f in Figure II-4. Rather
than repeating the two subtrees of Figure II-3, we here denote them as t1 and t2,
respectively.

            [CHAIN]
           /   |   \
         t1   t2   [ST/CH]
                    /   \
                [REQ]    f
                  |
                  e

Figure II-4: A Further Challenge






Finally, (13g) is a supporting explanation of (13f), and (13h) is a support of (13g), and thus of
(13f). Thus, the social force of this whole sequence could be notated as in Figure II-5, where g
is the propositional content of (13g) and h that of (13h).

              [CHAIN]
             /   |   \
           t1   t2   [ST/CH]
                      /   \
                 [REQ]    [ST/RSN]
                   |       /    \
                   e  [ST/RSN]   h
                        /  \
                       f    g

Figure II-5: A Complete Command and Control Speech Act Chain
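
One way to check a reading of this figure is to encode the tree and confirm that its leaves, taken left to right, follow the order of the utterances in example (13). The encoding below is an assumption based on the description in the text: t1 and t2 are expanded as the [ACK] and [ST/CH] subtrees, and f and g are taken to be the children of the inner [ST/RSN] node. The names `FIGURE_II_5`, `leaves`, and `outline` are hypothetical.

```python
from typing import List, Tuple, Union

# A tree is either a leaf label or a (label, children) pair.
Tree = Union[str, Tuple[str, list]]

T1 = ("[ACK]", ["0", "0"])    # (13a-b): call-response pair
T2 = ("[ST/CH]", ["c", "d"])  # (13c-d): request and challenge
T3 = (
    "[ST/CH]",
    [
        ("[REQ]", ["e"]),                               # (13e)
        ("[ST/RSN]", [("[ST/RSN]", ["f", "g"]), "h"]),  # (13f-h)
    ],
)
FIGURE_II_5 = ("[CHAIN]", [T1, T2, T3])


def leaves(tree: Tree) -> List[str]:
    """Return the leaf labels in left-to-right order."""
    if isinstance(tree, str):
        return [tree]
    _, children = tree
    result: List[str] = []
    for child in children:
        result.extend(leaves(child))
    return result


def outline(tree: Tree, depth: int = 0) -> List[str]:
    """Return the tree as indented lines, one node per line."""
    pad = "  " * depth
    if isinstance(tree, str):
        return [pad + tree]
    label, children = tree
    lines = [pad + label]
    for child in children:
        lines.extend(outline(child, depth + 1))
    return lines


print("\n".join(outline(FIGURE_II_5)))
print(leaves(FIGURE_II_5))
```

Under this encoding the leaf sequence runs 0, 0, c, d, e, f, g, h, matching utterances (13a) through (13h) in order, which is consistent with a grammar that describes how chains are constructed in real time.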